Re: [Mesa-dev] [Nine] 'meson: add -Werror=empty-body to disallow `if(x); `' - 'broke' Nine

2019-10-28 Thread Axel Davy

Hi,

I don't remember we had any serious unexpected nine regression except 
build breaks so far.
I'd expect potential regressions to be very subtle, and hard to catch 
without a lot of tests.
Thus I don't think there is much interest to plug only a small testing 
too in the CI.


Axel

On 25/10/2019 13:09, Timur Kristóf wrote:

While we are at it:

Would it be possible to add some CI tests to ensure that Nine doesn't
break (even if it builds), similarly to how some drivers run their CTS
tests in there? For instance, can we run Xnine or some other small
testing tool in the CI?

What do you think about this, Axel?

Thanks & best regards,
Tim



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Nine] 'meson: add -Werror=empty-body to disallow `if(x); `' - 'broke' Nine

2019-10-25 Thread Axel Davy

Hi Dieter,

Maybe the best fix would be to change the definition of WARN and DBG 
when DEBUG is disabled.


The definitions are in nine_debug.h

I haven't tried by maybe using "(void)" instead of nothing would work ?

Yours,

Axel

On 24/10/2019 16:34, Dieter Nützel wrote:

Hello Eric,

your mentioned commit (8d43e2b2ded0fe3c82d49561cdab9f208f9e64b6) broke 
building with NIne (-Dgallium-nine=true) for me.


starting with
[-]
e_st@sta/cubetexture9.c.o' -c 
../src/gallium/state_trackers/nine/cubetexture9.c
../src/gallium/state_trackers/nine/cubetexture9.c: In function 
‘NineCubeTexture9_ctor’:
../src/gallium/state_trackers/nine/cubetexture9.c:108:43: error: 
suggest braces around empty body in an ‘if’ statement 
[-Werror=empty-body]

  108 | "but this is unimplemented\n");
  |   ^
cc1: some warnings being treated as errors

--
Next

/surface9.c.o' -c ../src/gallium/state_trackers/nine/surface9.c
../src/gallium/state_trackers/nine/surface9.c: In function 
‘NineSurface9_GetContainer’:
../src/gallium/state_trackers/nine/surface9.c:334:40: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

  334 | DBG("QueryInterface FAILED!\n");
  |    ^
cc1: some warnings being treated as errors

--

@sta/swapchain9.c.o' -c ../src/gallium/state_trackers/nine/swapchain9.c
../src/gallium/state_trackers/nine/swapchain9.c: In function ‘present’:
../src/gallium/state_trackers/nine/swapchain9.c:737:51: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

  737 | pSourceRect->top, pSourceRect->bottom);
  |   ^
../src/gallium/state_trackers/nine/swapchain9.c:741:47: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

  741 | pDestRect->top, pDestRect->bottom);
  |   ^
cc1: some warnings being treated as errors

--

evice9.c.o' -c ../src/gallium/state_trackers/nine/device9.c
../src/gallium/state_trackers/nine/device9.c: In function 
‘NineDevice9_ctor’:
../src/gallium/state_trackers/nine/device9.c:296:49: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

  296 | DBG("\033[1;32mCSMT is active\033[0m\n");
  | ^
../src/gallium/state_trackers/nine/device9.c: In function 
‘create_zs_or_rt_surface’:
../src/gallium/state_trackers/nine/device9.c:1221:87: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]
 1221 |   DBG("FIXME Used shared handle! This option isn't 
probably handled correctly!\n");

|  ^
../src/gallium/state_trackers/nine/device9.c: In function 
‘NineDevice9_UpdateSurface’:
../src/gallium/state_trackers/nine/device9.c:1307:53: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

 1307 | pSourceRect->right, pSourceRect->bottom);
  | ^
../src/gallium/state_trackers/nine/device9.c:1309:68: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]
 1309 | DBG("pDestPoint = (%u,%u)\n", pDestPoint->x, 
pDestPoint->y);

|   ^
../src/gallium/state_trackers/nine/device9.c: In function 
‘NineDevice9_StretchRect’:
../src/gallium/state_trackers/nine/device9.c:1588:53: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

 1588 | pSourceRect->right, pSourceRect->bottom);
  | ^
../src/gallium/state_trackers/nine/device9.c:1591:49: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

 1591 | pDestRect->right, pDestRect->bottom);
  | ^
../src/gallium/state_trackers/nine/device9.c: In function 
‘NineDevice9_ColorFill’:
../src/gallium/state_trackers/nine/device9.c:1786:41: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

 1786 | pRect->right, pRect->bottom);
  | ^
../src/gallium/state_trackers/nine/device9.c: In function 
‘NineDevice9_CreateOffscreenPlainSurface’:
../src/gallium/state_trackers/nine/device9.c:1864:43: error: suggest 
braces around empty body in an ‘if’ statement [-Werror=empty-body]

 1864 | DBG("Failed to create surface.\n");
  |   ^
cc1: some warnings being treated as errors

--

st@sta/nine_shader.c.o' -c 
../src/gallium/state_trackers/nine/nine_shader.c
../src/gallium/state_trackers/nine/nine_shader.c: In function 
‘tx_dst_param_as_src’:
../src/gallium/state_trackers/nine/nine_shader.c:1437:52: error: 
suggest braces around empty body in an ‘if’ statement 
[-Werror=empty-body]

 1437 | 

Re: [Mesa-dev] [PATCH] d3dadapter9: Revert to old throttling limit value

2019-06-03 Thread Axel Davy

I've got no comments on this,
but it should be safe to push and find a better solution later.

Axel

On 30/05/2019 12:43, Axel Davy wrote:



Thanks Juan for warning me it didn't make it to mesa-dev.
Here it is.


Axel

 Forwarded Message 
Subject:[PATCH] d3dadapter9: Revert to old throttling limit value
Date:   Sun, 26 May 2019 23:23:59 +0200
From:   Axel Davy 
CC: Axel Davy , mesa-sta...@lists.freedesktop.org



Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2
to 1:
20909284f204091757c050aa40cfffaf3f981b9c

No driver seems to overwrite the default value.

One user reports severe regressions for some games.
For now, revert to the value 2 for nine.

Cc: mesa-sta...@lists.freedesktop.org

Signed-off-by: Axel Davy 
---
src/gallium/targets/d3dadapter9/drm.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/targets/d3dadapter9/drm.c 
b/src/gallium/targets/d3dadapter9/drm.c

index b0b9bb12f2c..657c619ac42 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -243,8 +243,10 @@ drm_create_adapter( int fd,
return D3DERR_DRIVERINTERNALERROR;
}
- ctx->base.throttling_value =
- ctx->base.hal->get_param(ctx->base.hal, PIPE_CAP_MAX_FRAMES_IN_FLIGHT);
+ /* Previously was set to PIPE_CAP_MAX_FRAMES_IN_FLIGHT,
+ * but the change of value of this cap to 1 seems to cause
+ * regressions. */
+ ctx->base.throttling_value = 2;
ctx->base.throttling = ctx->base.throttling_value > 0;
driParseOptionInfo(, __driConfigOptionsNine);
--
2.21.0



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] d3dadapter9: Revert to old throttling limit value

2019-05-30 Thread Axel Davy


Thanks Juan for warning me it didn't make it to mesa-dev.
Here it is.


Axel

 Forwarded Message 
Subject:[PATCH] d3dadapter9: Revert to old throttling limit value
Date:   Sun, 26 May 2019 23:23:59 +0200
From:   Axel Davy 
CC: Axel Davy , mesa-sta...@lists.freedesktop.org



Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2
to 1:
20909284f204091757c050aa40cfffaf3f981b9c

No driver seems to overwrite the default value.

One user reports severe regressions for some games.
For now, revert to the value 2 for nine.

Cc: mesa-sta...@lists.freedesktop.org

Signed-off-by: Axel Davy 
---
src/gallium/targets/d3dadapter9/drm.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/targets/d3dadapter9/drm.c 
b/src/gallium/targets/d3dadapter9/drm.c

index b0b9bb12f2c..657c619ac42 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -243,8 +243,10 @@ drm_create_adapter( int fd,
return D3DERR_DRIVERINTERNALERROR;
}
- ctx->base.throttling_value =
- ctx->base.hal->get_param(ctx->base.hal, PIPE_CAP_MAX_FRAMES_IN_FLIGHT);
+ /* Previously was set to PIPE_CAP_MAX_FRAMES_IN_FLIGHT,
+ * but the change of value of this cap to 1 seems to cause
+ * regressions. */
+ ctx->base.throttling_value = 2;
ctx->base.throttling = ctx->base.throttling_value > 0;
driParseOptionInfo(, __driConfigOptionsNine);

--
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/dri: decrease input lag by syncing sooner in SwapBuffers

2019-05-26 Thread Axel Davy

hi,

I haven't done enough testing to have a full understanding of the issue, 
but it seems there could be a regression with gallium nine.


One user is reporting on irc huge slowdowns with d3d8 games (using the 
d3d8to9 wrapper) when using vsync. Reverting fixes the issues.


I don't know exactly why vsync would have issues.

Another thing though is that many d3d9 games do their own throttling, 
and if our throttling is stricter, maybe the game multithreading could 
be perturbed.

All in all, this needs more testing with gallium nine.

To prevent possible regression for next release, I'll force 
PIPE_CAP_MAX_FRAMES_IN_FLIGHT to be clamped to 2 at least for gallium nine.



Might be worth trying to see if gl could be affected.

Yours,

Axel

On 02/05/2019 03:19, Marek Olšák wrote:

If there is no other feedback, I'll push this tomorrow.

Marek

On Mon, Apr 29, 2019 at 6:12 PM Marek Olšák > wrote:


This patch might improve performance, because less submitted
unfinished work means less used memory by the unfinished work.

Marek

On Mon, Apr 29, 2019 at 11:07 AM Michel Dänzer mailto:mic...@daenzer.net>> wrote:

On 2019-04-27 6:13 p.m., Rob Clark wrote:
> On Thu, Apr 25, 2019 at 7:06 PM Marek Olšák
mailto:mar...@gmail.com>> wrote:
>>
>> From: Marek Olšák mailto:marek.ol...@amd.com>>
>>
>> It's done by:
>> - decrease the number of frames in flight by 1
>> - flush before throttling in SwapBuffers
>>   (instead of wait-then-flush, do flush-then-wait)
>>
>> The improvement is apparent with Unigine Heaven.
>>
>> Previously:
>>     draw frame 2
>>     wait frame 0
>>     flush frame 2
>>     present frame 2
>>
>>     The input lag is 2 frames.
>>
>> Now:
>>     draw frame 2
>>     flush frame 2
>>     wait frame 1
>>     present frame 2
>>
>>     The input lag is 1 frame. Flushing is done before
waiting, because
>>     otherwise the device would be idle after waiting.
>>
>> Nine is affected because it also uses the pipe cap.
>> ---
>>  src/gallium/auxiliary/util/u_screen.c         | 2 +-
>>  src/gallium/state_trackers/dri/dri_drawable.c | 20
+--
>>  2 files changed, 11 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_screen.c
b/src/gallium/auxiliary/util/u_screen.c
>> index 27f51e0898e..410f17421e6 100644
>> --- a/src/gallium/auxiliary/util/u_screen.c
>> +++ b/src/gallium/auxiliary/util/u_screen.c
>> @@ -349,21 +349,21 @@
u_pipe_screen_get_param_defaults(struct pipe_screen *pscreen,
>>     case PIPE_CAP_MAX_VARYINGS:
>>        return 8;
>>
>>     case PIPE_CAP_COMPUTE_GRID_INFO_LAST_BLOCK:
>>        return 0;
>>
>>     case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
>>        return 0;
>>
>>     case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
>> -      return 2;
>> +      return 1;
>
> would it be safer to leave the current default and let
drivers opt-in
> to the lower # individually?  I guess triple buffering would
still be
> better for some of the smaller gpu's?

This patch doesn't prevent triple buffering. The application
can still
prepare up to one frame worth of GPU commands before the GPU has
finished processing the commands of the previous frame (while the
pre-previous frame is being displayed).

I just think the term "in flight" should maybe be defined a
bit better,
but it's not a blocker and could be done in a follow-up patch.


-- 
Earthling Michel Dänzer               | https://www.amd.com

Libre software enthusiast             |             Mesa and X
developer


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [MR] Gallium nine fixes and improvements for Mesa 19.1

2019-04-27 Thread Axel Davy

Hi,


I usually send my patch series via mail, but as this one was rather 
long, and as people seem to find gitlab easier to work with for long 
patch series, this time I used the merge request system.



Don't hesitate to take a look:

https://gitlab.freedesktop.org/mesa/mesa/merge_requests/748


Yours,

Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/dri: decrease input lag by syncing sooner in SwapBuffers

2019-04-27 Thread Axel Davy

On 27/04/2019 21:02, Rob Clark wrote:

On Sat, Apr 27, 2019 at 9:52 AM Axel Davy  wrote:

On 27/04/2019 18:13, Rob Clark wrote:

On Thu, Apr 25, 2019 at 7:06 PM Marek Olšák  wrote:

From: Marek Olšák 

It's done by:
- decrease the number of frames in flight by 1
- flush before throttling in SwapBuffers
(instead of wait-then-flush, do flush-then-wait)

The improvement is apparent with Unigine Heaven.

Previously:
  draw frame 2
  wait frame 0
  flush frame 2
  present frame 2

  The input lag is 2 frames.

Now:
  draw frame 2
  flush frame 2
  wait frame 1
  present frame 2

  The input lag is 1 frame. Flushing is done before waiting, because
  otherwise the device would be idle after waiting.

Nine is affected because it also uses the pipe cap.
---
   src/gallium/auxiliary/util/u_screen.c |  2 +-
   src/gallium/state_trackers/dri/dri_drawable.c | 20 +--
   2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_screen.c 
b/src/gallium/auxiliary/util/u_screen.c
index 27f51e0898e..410f17421e6 100644
--- a/src/gallium/auxiliary/util/u_screen.c
+++ b/src/gallium/auxiliary/util/u_screen.c
@@ -349,21 +349,21 @@ u_pipe_screen_get_param_defaults(struct pipe_screen 
*pscreen,
  case PIPE_CAP_MAX_VARYINGS:
 return 8;

  case PIPE_CAP_COMPUTE_GRID_INFO_LAST_BLOCK:
 return 0;

  case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
 return 0;

  case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
-  return 2;
+  return 1;

would it be safer to leave the current default and let drivers opt-in
to the lower # individually?  I guess triple buffering would still be
better for some of the smaller gpu's?

disclaimer: I haven't tested this myself or looked very closely at the
dri code.. so could be misunderstanding something..

BR,
-R


I think I can shed some light on the issue to justify (or not) the change.

If we don't do throttling and the CPU renders frames at a faster rate
than what the GPU can render,
the GPU can become quite late and cause huge frame lag.

The throttling involves forcing a (CPU) wait when a frame is presented
if the 'x' previous images have not finished rendering.

The lower 'x', the lower the frame lag.

However if 'x' is too low (waiting current frame is rendered for
example), the GPU can be idle until the CPU is flushing new commands.

Now there is a choice to be made for the value of 'x'. 1 or 2 are
reasonable values.

if 'x' is 1, we wait the previous frame was rendered when we present the
current frame. For '2' we wait the frame before.


Thus for smaller gpu's, a value of 1 is better than 2 as it is more
affected by the frame lag (as frames take longer to render).


I get the latency aspect.. but my comment was more about latency vs
framerate (or maybe more cynically, about games vs benchmarks :-P)

BR,
-R



As long at the GPU has work to do, performance should be maximized.

However in the case I described below, if CPU and GPU render at about 
the same framerate and
the framerate has some variations (whether it being the GPU taking more 
time for one frame, or the CPU),

using more than 1 would give a bit better performance.


Axel






However if a game is rendering at a very unstable framerate (some frames
needing more work than others), using a value of 2 is safer
to maximize performance. (As using a value of 1 would lead to wait if we
get a frame that takes particularly long, as using 2 smooths that a bit)


I remember years ago I had extremely unstable fps when using catalyst on
Portal for example. But I believe it is more a driver issue than a game
issue.

If we assume games don't have unstable framerate, (which seems
reasonable assumption), using 1 as default makes sense.


If one wants to test experimentally for regression, the ideal test case
if when the GPU renders at about the same framerate as the CPU feeds it.



Axel






___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/dri: decrease input lag by syncing sooner in SwapBuffers

2019-04-27 Thread Axel Davy

On 27/04/2019 18:13, Rob Clark wrote:

On Thu, Apr 25, 2019 at 7:06 PM Marek Olšák  wrote:

From: Marek Olšák 

It's done by:
- decrease the number of frames in flight by 1
- flush before throttling in SwapBuffers
   (instead of wait-then-flush, do flush-then-wait)

The improvement is apparent with Unigine Heaven.

Previously:
 draw frame 2
 wait frame 0
 flush frame 2
 present frame 2

 The input lag is 2 frames.

Now:
 draw frame 2
 flush frame 2
 wait frame 1
 present frame 2

 The input lag is 1 frame. Flushing is done before waiting, because
 otherwise the device would be idle after waiting.

Nine is affected because it also uses the pipe cap.
---
  src/gallium/auxiliary/util/u_screen.c |  2 +-
  src/gallium/state_trackers/dri/dri_drawable.c | 20 +--
  2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_screen.c 
b/src/gallium/auxiliary/util/u_screen.c
index 27f51e0898e..410f17421e6 100644
--- a/src/gallium/auxiliary/util/u_screen.c
+++ b/src/gallium/auxiliary/util/u_screen.c
@@ -349,21 +349,21 @@ u_pipe_screen_get_param_defaults(struct pipe_screen 
*pscreen,
 case PIPE_CAP_MAX_VARYINGS:
return 8;

 case PIPE_CAP_COMPUTE_GRID_INFO_LAST_BLOCK:
return 0;

 case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
return 0;

 case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
-  return 2;
+  return 1;

would it be safer to leave the current default and let drivers opt-in
to the lower # individually?  I guess triple buffering would still be
better for some of the smaller gpu's?

disclaimer: I haven't tested this myself or looked very closely at the
dri code.. so could be misunderstanding something..

BR,
-R



I think I can shed some light on the issue to justify (or not) the change.

If we don't do throttling and the CPU renders frames at a faster rate 
than what the GPU can render,

the GPU can become quite late and cause huge frame lag.

The throttling involves forcing a (CPU) wait when a frame is presented 
if the 'x' previous images have not finished rendering.


The lower 'x', the lower the frame lag.

However if 'x' is too low (waiting current frame is rendered for 
example), the GPU can be idle until the CPU is flushing new commands.


Now there is a choice to be made for the value of 'x'. 1 or 2 are 
reasonable values.


if 'x' is 1, we wait the previous frame was rendered when we present the 
current frame. For '2' we wait the frame before.



Thus for smaller gpu's, a value of 1 is better than 2 as it is more 
affected by the frame lag (as frames take longer to render).



However if a game is rendering at a very unstable framerate (some frames 
needing more work than others), using a value of 2 is safer
to maximize performance. (As using a value of 1 would lead to wait if we 
get a frame that takes particularly long, as using 2 smooths that a bit)



I remember years ago I had extremely unstable fps when using catalyst on 
Portal for example. But I believe it is more a driver issue than a game 
issue.


If we assume games don't have unstable framerate, (which seems 
reasonable assumption), using 1 as default makes sense.



If one wants to test experimentally for regression, the ideal test case 
if when the GPU renders at about the same framerate as the CPU feeds it.




Axel






 case PIPE_CAP_DMABUF:
  #ifdef PIPE_OS_LINUX
return 1;
  #else
return 0;
  #endif

 default:
unreachable("bad PIPE_CAP_*");
diff --git a/src/gallium/state_trackers/dri/dri_drawable.c 
b/src/gallium/state_trackers/dri/dri_drawable.c
index 26bfdbecc53..c1de3bed9dd 100644
--- a/src/gallium/state_trackers/dri/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/dri_drawable.c
@@ -555,33 +555,33 @@ dri_flush(__DRIcontext *cPriv,
 *
 * This pulls a fence off the throttling queue and waits for it if the
 * number of fences on the throttling queue has reached the desired
 * number.
 *
 * Then flushes to insert a fence at the current rendering position, and
 * pushes that fence on the queue. This requires that the 
st_context_iface
 * flush method returns a fence even if there are no commands to flush.
 */
struct pipe_screen *screen = drawable->screen->base.screen;
-  struct pipe_fence_handle *fence;
+  struct pipe_fence_handle *oldest_fence, *new_fence = NULL;

-  fence = swap_fences_pop_front(drawable);
-  if (fence) {
- (void) screen->fence_finish(screen, NULL, fence, 
PIPE_TIMEOUT_INFINITE);
- screen->fence_reference(screen, , NULL);
-  }
+  st->flush(st, flush_flags, _fence);

-  st->flush(st, flush_flags, );
+  oldest_fence = swap_fences_pop_front(drawable);
+  if (oldest_fence) {
+ screen->fence_finish(screen, NULL, oldest_fence, 
PIPE_TIMEOUT_INFINITE);
+ screen->fence_reference(screen, _fence, NULL);
+  }

-  

Re: [Mesa-dev] [PATCH] st/dri: decrease input lag by syncing sooner in SwapBuffers

2019-04-27 Thread Axel Davy

On 26/04/2019 20:40, Marek Olšák wrote:
On Fri, Apr 26, 2019 at 12:56 PM Axel Davy <mailto:davyax...@gmail.com>> wrote:


On 26/04/2019 10:08, Michel Dänzer wrote:
> On 2019-04-26 4:06 a.m., Marek Olšák wrote:
>> From: Marek Olšák mailto:marek.ol...@amd.com>>
>>
>> It's done by:
>> - decrease the number of frames in flight by 1
>> - flush before throttling in SwapBuffers
>>    (instead of wait-then-flush, do flush-then-wait)
>>
>> The improvement is apparent with Unigine Heaven.
>>
>> Previously:
>>      draw frame 2
>>      wait frame 0
>>      flush frame 2
>>      present frame 2
>>
>>      The input lag is 2 frames.
>>
>> Now:
>>      draw frame 2
>>      flush frame 2
>>      wait frame 1
>>      present frame 2
>>
>>      The input lag is 1 frame. Flushing is done before waiting,
because
>>      otherwise the device would be idle after waiting.
> Nice idea. Not sure offhand about all ramifications, but
certainly worth
> a go.
>
>
>> Nine is affected because it also uses the pipe cap.
>> ---
>>   src/gallium/auxiliary/util/u_screen.c         | 2 +-
>>   src/gallium/state_trackers/dri/dri_drawable.c | 20
+--
>>   2 files changed, 11 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_screen.c
b/src/gallium/auxiliary/util/u_screen.c
>> index 27f51e0898e..410f17421e6 100644
>> --- a/src/gallium/auxiliary/util/u_screen.c
>> +++ b/src/gallium/auxiliary/util/u_screen.c
>> @@ -349,21 +349,21 @@ u_pipe_screen_get_param_defaults(struct
pipe_screen *pscreen,
>>      case PIPE_CAP_MAX_VARYINGS:
>>         return 8;
>>
>>      case PIPE_CAP_COMPUTE_GRID_INFO_LAST_BLOCK:
>>         return 0;
>>
>>      case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
>>         return 0;
>>
>>      case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
>> -      return 2;
>> +      return 1;
> This might be slightly misleading, as there can still be two
frames in
> flight (on the GPU) at the same time. Might be better to leave
this at 2
> (so Nine isn't affected) and adjust its treatment in
> src/gallium/state_trackers/dri/dri_drawable.c .
>
>
Checking what gallium nine does currently, it seems we already do
flush
then wait,
however we call swap_fences_pop_front and swap_fences_push_back in
the
reverse order compared to your patch.
We compensate by taking PIPE_CAP_MAX_FRAMES_IN_FLIGHT + 1

In conclusion, with the proposed patch, gl and nine should have
the same
behaviour (and thus if gl benefits from a value of 1, nine should
as well).
I haven't have noticed input lag, I guess I have to test on heaven if
you see a difference.
How can I slow down my gpu to test that ? I use to use the
/sys/kernel/debug/dri/0/ vars to force low dpm, but it doesn't
seem to
be possible anymore as the related files are gone (rx480) ?


I set maximum settings, windowed, resolution: custom, and I type in 
the 4K resolution (I don't have a 4K monitor). When it's running, I 
enable wireframe. It should be pretty slow.


Marek



I couldn't notice any performance difference (card at low or high dpm - 
somehow the vars were still there, I was looking at the wrong place),
with and without the change and gallium nine. I couldn't notice a change 
in lag either (slowest I had was around 20 fps, which may not be the 
best to see that).


I'm fine with this change affecting nine.


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/dri: decrease input lag by syncing sooner in SwapBuffers

2019-04-26 Thread Axel Davy

On 26/04/2019 10:08, Michel Dänzer wrote:

On 2019-04-26 4:06 a.m., Marek Olšák wrote:

From: Marek Olšák 

It's done by:
- decrease the number of frames in flight by 1
- flush before throttling in SwapBuffers
   (instead of wait-then-flush, do flush-then-wait)

The improvement is apparent with Unigine Heaven.

Previously:
 draw frame 2
 wait frame 0
 flush frame 2
 present frame 2

 The input lag is 2 frames.

Now:
 draw frame 2
 flush frame 2
 wait frame 1
 present frame 2

 The input lag is 1 frame. Flushing is done before waiting, because
 otherwise the device would be idle after waiting.

Nice idea. Not sure offhand about all ramifications, but certainly worth
a go.



Nine is affected because it also uses the pipe cap.
---
  src/gallium/auxiliary/util/u_screen.c |  2 +-
  src/gallium/state_trackers/dri/dri_drawable.c | 20 +--
  2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_screen.c 
b/src/gallium/auxiliary/util/u_screen.c
index 27f51e0898e..410f17421e6 100644
--- a/src/gallium/auxiliary/util/u_screen.c
+++ b/src/gallium/auxiliary/util/u_screen.c
@@ -349,21 +349,21 @@ u_pipe_screen_get_param_defaults(struct pipe_screen 
*pscreen,
 case PIPE_CAP_MAX_VARYINGS:
return 8;
  
 case PIPE_CAP_COMPUTE_GRID_INFO_LAST_BLOCK:

return 0;
  
 case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:

return 0;
  
 case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:

-  return 2;
+  return 1;

This might be slightly misleading, as there can still be two frames in
flight (on the GPU) at the same time. Might be better to leave this at 2
(so Nine isn't affected) and adjust its treatment in
src/gallium/state_trackers/dri/dri_drawable.c .


Checking what gallium nine does currently, it seems we already do flush 
then wait,
however we call swap_fences_pop_front and swap_fences_push_back in the 
reverse order compared to your patch.

We compensate by taking PIPE_CAP_MAX_FRAMES_IN_FLIGHT + 1

In conclusion, with the proposed patch, gl and nine should have the same 
behaviour (and thus if gl benefits from a value of 1, nine should as well).
I haven't have noticed input lag, I guess I have to test on heaven if 
you see a difference.
How can I slow down my gpu to test that ? I use to use the 
/sys/kernel/debug/dri/0/ vars to force low dpm, but it doesn't seem to 
be possible anymore as the related files are gone (rx480) ?



Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] st/nine: skip position checks in SetCursorPosition()

2019-04-09 Thread Axel Davy

Fine by me, but please improve the drm.h description, see below.
With that changed, this is:
Reviewed-by: Axel Davy 

Axel

On 08/04/2019 09:10, Andre Heider wrote:

For HW cursors, "cursor.pos" doesn't hold the current position of the
pointer, just the position of the last call to SetCursorPosition().

Skip the check against stale values and bump the d3dadapter9 drm version
to expose this change of behaviour.

Signed-off-by: Andre Heider 
---
V2: don't introduce SetVersion(), bump D3DADAPTER9DRM_MINOR instead

Corresponding d3d9-nine.dll patch:
https://github.com/iXit/wine-nine-standalone/commit/e09fcbbad4efd481833df1123de0cb690e1b2860

  include/d3dadapter/drm.h  | 7 +--
  src/gallium/state_trackers/nine/device9.c | 8 +---
  2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/d3dadapter/drm.h b/include/d3dadapter/drm.h
index 647f017fc7f..210e2395669 100644
--- a/include/d3dadapter/drm.h
+++ b/include/d3dadapter/drm.h
@@ -29,11 +29,14 @@
  #define D3DADAPTER9DRM_NAME "drm"
  /* current version */
  #define D3DADAPTER9DRM_MAJOR 0
-#define D3DADAPTER9DRM_MINOR 1
+#define D3DADAPTER9DRM_MINOR 2
  
  /* version 0.0: Initial release

   * 0.1: All IDirect3D objects can be assumed to have a pointer to the
- *  internal vtable in second position of the structure */
+ *  internal vtable in second position of the structure
+ * 0.2: IDirect3DDevice9_SetCursorPosition doesn't check the cursor
+ *  position anymore
+ */


This is ambiguous (we could be checking bounds, or whatever). I think 
being more explicit should be preferred.


For example: doesn't filter out redundant cursor position settings anymore.

or: all IDirect3DDevice9_SetCursorPosition call with hardware cursor 
results in a call to ID3DPresent_SetCursorPos.



  
  struct D3DAdapter9DRM

  {
diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index db1c3a1d23d..0b1fe59cb70 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -793,9 +793,11 @@ NineDevice9_SetCursorPosition( struct NineDevice9 *This,
  
  DBG("This=%p X=%d Y=%d Flags=%d\n", This, X, Y, Flags);
  
-if (This->cursor.pos.x == X &&

-This->cursor.pos.y == Y)
-return;
+/* present >= v1.4 handles this itself */
+if (This->minor_version_num < 4) {
+if (This->cursor.pos.x == X && This->cursor.pos.y == Y)
+return;
+}
  
  This->cursor.pos.x = X;

  This->cursor.pos.y = Y;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] st/nine: introduce SetVersion() for present interface v1.4

2019-04-06 Thread Axel Davy
Just to give a follow-up on this patch for the mailing list, we 
discussed on irc that there was already a way to advertise mesa version 
(the d3dadapter9 version).


Axel

On 04/04/2019 12:34, Andre Heider wrote:

A follow up patch requires a behaviour change, so we need to negotiate
the present version to be used to keep backward compatiblity.

We already get the supported version from the WINE side via GetVersion().
Introduce the member SetVersion() to pass back the maximum common
version.

Signed-off-by: Andre Heider 
---

Corresponding d3d9-nine.dll patch:
https://github.com/iXit/wine-nine-standalone/commit/c7d3b86ee3dc40f897508cd13a3862c277cbe08c

  include/d3dadapter/present.h   |  3 +++
  src/gallium/state_trackers/nine/adapter9.c | 11 +++
  2 files changed, 14 insertions(+)

diff --git a/include/d3dadapter/present.h b/include/d3dadapter/present.h
index 0325ebc511f..2f784837cfb 100644
--- a/include/d3dadapter/present.h
+++ b/include/d3dadapter/present.h
@@ -151,6 +151,8 @@ typedef struct ID3DPresentGroupVtbl
  /* used to create additional presentation interfaces along the way */
  HRESULT (WINAPI *CreateAdditionalPresent)(ID3DPresentGroup *This, 
D3DPRESENT_PARAMETERS *pPresentationParameters, ID3DPresent **ppPresent);
  void (WINAPI *GetVersion) (ID3DPresentGroup *This, int *major, int 
*minor);
+/* Available since version 1.4 */
+void (WINAPI *SetVersion) (ID3DPresentGroup *This, int major, int minor);
  } ID3DPresentGroupVtbl;
  
  struct ID3DPresentGroup

@@ -167,6 +169,7 @@ struct ID3DPresentGroup
  #define ID3DPresentGroup_GetPresent(p,a,b) (p)->lpVtbl->GetPresent(p,a,b)
  #define ID3DPresentGroup_CreateAdditionalPresent(p,a,b) 
(p)->lpVtbl->CreateAdditionalPresent(p,a,b)
  #define ID3DPresentGroup_GetVersion(p,a,b) (p)->lpVtbl->GetVersion(p,a,b)
+#define ID3DPresentGroup_SetVersion(p,a,b) (p)->lpVtbl->SetVersion(p,a,b)
  
  #endif /* __cplusplus */
  
diff --git a/src/gallium/state_trackers/nine/adapter9.c b/src/gallium/state_trackers/nine/adapter9.c

index 3aa95b93b2f..4f648e894b8 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -34,6 +34,9 @@
  
  #define DBG_CHANNEL DBG_ADAPTER
  
+/* The maximum supported present version */

+#define MAX_PRESENT_VERSION_MINOR 4
+
  HRESULT
  NineAdapter9_ctor( struct NineAdapter9 *This,
 struct NineUnknownParams *pParams,
@@ -999,6 +1002,14 @@ NineAdapter9_CreateDevice( struct NineAdapter9 *This,
  return D3DERR_NOTAVAILABLE;
  }
  
+if (minor >= 4) {

+/* d3d9-nine.dll might support a higher present version than we do.
+ * Limit it to our supported version to keep expected behaviour.
+ */
+minor = MIN2(minor, MAX_PRESENT_VERSION_MINOR);
+ID3DPresentGroup_SetVersion(pPresentationGroup, major, minor);
+}
+
  hr = NineAdapter9_GetScreen(This, DeviceType, );
  if (FAILED(hr)) {
  DBG("Failed to get pipe_screen.\n");



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/nine: enable csmt per default on iris

2019-03-20 Thread Axel Davy

On 20/03/2019 21:38, Andre Heider wrote:

iris is thread safe, enable csmt for a ~5% performace boost.

Signed-off-by: Andre Heider 
---
  src/gallium/state_trackers/nine/device9.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 24c8ce062b3..db1c3a1d23d 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -266,13 +266,15 @@ NineDevice9_ctor( struct NineDevice9 *This,
  }
  
  /* Initialize CSMT */

+/* r600, radeonsi and iris are thread safe. */
  if (pCTX->csmt_force == 1)
  This->csmt_active = true;
  else if (pCTX->csmt_force == 0)
  This->csmt_active = false;
-else
-/* r600 and radeonsi are thread safe. */
-This->csmt_active = strstr(pScreen->get_name(pScreen), "AMD") != NULL;
+else if (strstr(pScreen->get_name(pScreen), "AMD") != NULL)
+This->csmt_active = true;
+else if (strstr(pScreen->get_name(pScreen), "Intel") != NULL)
+This->csmt_active = true;
  
  /* We rely on u_upload_mgr using persistent coherent buffers (which don't

   * require flush to work in multi-pipe_context scenario) for vertex and



Could have been an || inside the same if, but maybe it is easier to read 
that way.



Reviewed-by: Axel Davy 


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/docs: clarify set_sampler_views

2019-03-15 Thread Axel Davy

On 15/03/2019 13:12, Rob Clark wrote:

On Fri, Mar 15, 2019 at 3:49 AM Axel Davy  wrote:

On 15/03/2019 03:12, Rob Clark wrote:

On Thu, Mar 14, 2019 at 9:58 PM Roland Scheidegger  wrote:

Am 15.03.19 um 02:18 schrieb Rob Clark:

On Thu, Mar 14, 2019 at 8:28 PM Roland Scheidegger  wrote:

Am 14.03.19 um 22:06 schrieb Rob Clark:

On Thu, Mar 14, 2019 at 3:58 PM Roland Scheidegger  wrote:

Am 14.03.19 um 14:13 schrieb Rob Clark:

On Tue, Mar 12, 2019 at 1:59 PM Roland Scheidegger  wrote:

Am 12.03.19 um 16:16 schrieb Rob Clark:

This previously was not called out clearly, but based on a survey of the
code, it seems the expected behavior is to release the reference to any
sampler views beyond the new range being bound.

That isn't really true. This was designed to work like d3d10, where
other views are unmodified.
The cso code will actually unset all views which previously were set and
are above the num_views in the call (this wouldn't be necessary if the
pipe function itself would work like this).
However, it will only do this for fragment textures, and pass through
the parameters unmodified otherwise. Which means behavior might not be
very consistent for the different stages...

Any opinion about whether views==NULL should be allowed?  Currently I
have something like:


diff --git a/src/gallium/docs/source/context.rst
b/src/gallium/docs/source/context.rst
index f89d9e1005e..06d30bfb38b 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -143,6 +143,11 @@ to the array index which is used for sampling.
 to a respective sampler view and releases a reference to the previous
 sampler view.

+  Sampler views outside of ``[start_slot, start_slot + num_views)`` are
+  unmodified.  If ``views`` is NULL, the behavior is the same as if
+  ``views[n]`` was NULL for the entire range, ie. releasing the reference
+  for all the sampler views in the specified range.
+
   * ``create_sampler_view`` creates a new sampler view. ``texture`` is 
associated
 with the sampler view which results in sampler view holding a reference
 to the texture. Format specified in template must be compatible


But going thru the other drivers, a lot of them also don't handle the
views==NULL case.  This case doesn't seem to come up with mesa/st, but
does with XA and nine, and some of the test code.

I think this should be illegal. As you've noted some drivers can't
handle it, and I don't see a particularly good reason to allow it. Well
I guess it trades some complexity in state trackers with some complexity
in drivers...

fwiw, going with the idea that it should be legal, I fixed that in the
drivers that didn't handle it in:

https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2Fmerge_requests%2F449data=02%7C01%7Csroland%40vmware.com%7C2fe81dea2d9d4de1974f08d6a8e42caa%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636882095286989477sdata=qd1z5iv8dvt2z16ZlT2OPngoDGofvCM%2F%2F0hsddqAbO4%3Dreserved=0

(planning to send to list, I just pushed a WIP MR to run it thru the CI system)

I'm pretty sure both softpipe and llvmpipe would crash too, they
dereference this without checking if it's null.
So effectively all drivers but one thought it was illegal?
I still see no point in allowing it (or rather, changing this to be
allowed - per se there's nothing really wrong with this to be allowed).
That said, it appears that set_shader_images and set_shader_buffers
allow it, so there's some precedence for this.

hmm, I'd assumed llvmpipe was used with xa somewhere so I didn't
really look at it and assumed it handled this..

xa only sets fragment sampler views, and those only through cso.
cso will turn this into a non-null views parameter.
(cso itself also won't tolerate null views parameter, unless the count
is zero, but that should be alright since its semantics are that it will
unbind all views above the count - well for fragment sampler views...)
nine also sets vertex sampler views through cso, which will get passed
through to drivers as-is. However, the NULL views used there is always
accompanied by a 0 count, so for drivers interpreting things as range to
change rather than unbind things outside it is a natural no-op, and
they'll never even look at views in their loop. (Of course, that's not
quite what nine actually wanted to do...)
And yes things are very inconsistent when passed through cso (for
drivers interpreting it as range to change), since cso will unbind the
views above count for fragment shader views explicitly, but won't do
anything for any other shader stage...




but as you mentioned below, if set_shader_buffers and
set_shader_images allow null, for consistency (and since I'm already
fixing up a bunch of set_shader_buffer implementations, so handling
the ==NULL case isn't a big deal), I'd lean towards allowing NULL.  I
guess if we are going to do API cleanup, then consistency is a useful
thing.. I can check llvmpipe

Re: [Mesa-dev] [PATCH] gallium/docs: clarify set_sampler_views

2019-03-15 Thread Axel Davy

On 15/03/2019 03:12, Rob Clark wrote:

On Thu, Mar 14, 2019 at 9:58 PM Roland Scheidegger  wrote:

Am 15.03.19 um 02:18 schrieb Rob Clark:

On Thu, Mar 14, 2019 at 8:28 PM Roland Scheidegger  wrote:

Am 14.03.19 um 22:06 schrieb Rob Clark:

On Thu, Mar 14, 2019 at 3:58 PM Roland Scheidegger  wrote:

Am 14.03.19 um 14:13 schrieb Rob Clark:

On Tue, Mar 12, 2019 at 1:59 PM Roland Scheidegger  wrote:

Am 12.03.19 um 16:16 schrieb Rob Clark:

This previously was not called out clearly, but based on a survey of the
code, it seems the expected behavior is to release the reference to any
sampler views beyond the new range being bound.

That isn't really true. This was designed to work like d3d10, where
other views are unmodified.
The cso code will actually unset all views which previously were set and
are above the num_views in the call (this wouldn't be necessary if the
pipe function itself would work like this).
However, it will only do this for fragment textures, and pass through
the parameters unmodified otherwise. Which means behavior might not be
very consistent for the different stages...

Any opinion about whether views==NULL should be allowed?  Currently I
have something like:


diff --git a/src/gallium/docs/source/context.rst
b/src/gallium/docs/source/context.rst
index f89d9e1005e..06d30bfb38b 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -143,6 +143,11 @@ to the array index which is used for sampling.
to a respective sampler view and releases a reference to the previous
sampler view.

+  Sampler views outside of ``[start_slot, start_slot + num_views)`` are
+  unmodified.  If ``views`` is NULL, the behavior is the same as if
+  ``views[n]`` was NULL for the entire range, ie. releasing the reference
+  for all the sampler views in the specified range.
+
  * ``create_sampler_view`` creates a new sampler view. ``texture`` is 
associated
with the sampler view which results in sampler view holding a reference
to the texture. Format specified in template must be compatible


But going thru the other drivers, a lot of them also don't handle the
views==NULL case.  This case doesn't seem to come up with mesa/st, but
does with XA and nine, and some of the test code.

I think this should be illegal. As you've noted some drivers can't
handle it, and I don't see a particularly good reason to allow it. Well
I guess it trades some complexity in state trackers with some complexity
in drivers...

fwiw, going with the idea that it should be legal, I fixed that in the
drivers that didn't handle it in:

https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2Fmerge_requests%2F449data=02%7C01%7Csroland%40vmware.com%7C2fe81dea2d9d4de1974f08d6a8e42caa%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636882095286989477sdata=qd1z5iv8dvt2z16ZlT2OPngoDGofvCM%2F%2F0hsddqAbO4%3Dreserved=0

(planning to send to list, I just pushed a WIP MR to run it thru the CI system)

I'm pretty sure both softpipe and llvmpipe would crash too, they
dereference this without checking if it's null.
So effectively all drivers but one thought it was illegal?
I still see no point in allowing it (or rather, changing this to be
allowed - per se there's nothing really wrong with this to be allowed).
That said, it appears that set_shader_images and set_shader_buffers
allow it, so there's some precedence for this.

hmm, I'd assumed llvmpipe was used with xa somewhere so I didn't
really look at it and assumed it handled this..

xa only sets fragment sampler views, and those only through cso.
cso will turn this into a non-null views parameter.
(cso itself also won't tolerate null views parameter, unless the count
is zero, but that should be alright since its semantics are that it will
unbind all views above the count - well for fragment sampler views...)
nine also sets vertex sampler views through cso, which will get passed
through to drivers as-is. However, the NULL views used there is always
accompanied by a 0 count, so for drivers interpreting things as range to
change rather than unbind things outside it is a natural no-op, and
they'll never even look at views in their loop. (Of course, that's not
quite what nine actually wanted to do...)
And yes things are very inconsistent when passed through cso (for
drivers interpreting it as range to change), since cso will unbind the
views above count for fragment shader views explicitly, but won't do
anything for any other shader stage...




but as you mentioned below, if set_shader_buffers and
set_shader_images allow null, for consistency (and since I'm already
fixing up a bunch of set_shader_buffer implementations, so handling
the ==NULL case isn't a big deal), I'd lean towards allowing NULL.  I
guess if we are going to do API cleanup, then consistency is a useful
thing.. I can check llvmpipe and softpipe and add patches to fix them
if needed.

Yes consistency is a nice goal. I'm just not sure 

Re: [Mesa-dev] [PATCH 2/5] d3dadapter9: Support software renderer on any DRI device

2019-03-09 Thread Axel Davy

After pushing this (with my r-b), gitlab's travis complained.

It seems there needs to be some ifdefs to check if kms was built.

I reverted the patch until it is fixed.

Axel

On 07/03/2019 23:23, Axel Davy wrote:

From: Patrick Rudolph 

If D3D_ALWAYS_SOFTWARE is set for debugging purposes,
run on any DRI enabled platform.
Instead of probing for a compatible gallium driver (which might
fail if there's none) always use the KMS DRI software renderer.

Allows to run nine on i915 when D3D_ALWAYS_SOFTWARE=1.

Signed-off-by: Patrick Rudolph 
---
  src/gallium/targets/d3dadapter9/drm.c | 28 +++
  1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/src/gallium/targets/d3dadapter9/drm.c 
b/src/gallium/targets/d3dadapter9/drm.c
index 1d01d4a067c..28dbd2ef9db 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -205,6 +205,7 @@ drm_create_adapter( int fd,
  struct d3dadapter9drm_context *ctx = 
CALLOC_STRUCT(d3dadapter9drm_context);
  HRESULT hr;
  bool different_device;
+bool software_device;
  const struct drm_conf_ret *throttle_ret = NULL;
  const struct drm_conf_ret *dmabuf_ret = NULL;
  driOptionCache defaultInitOptions;
@@ -222,7 +223,11 @@ drm_create_adapter( int fd,
  ctx->fd = fd;
  ctx->base.linear_framebuffer = different_device;
  
-if (!pipe_loader_drm_probe_fd(>dev, fd)) {

+const char *force_sw = getenv("D3D_ALWAYS_SOFTWARE");
+software_device = force_sw && !strcmp(force_sw, "1");
+
+if ((software_device && !pipe_loader_sw_probe_kms(>dev, fd)) ||
+(!software_device && !pipe_loader_drm_probe_fd(>dev, fd))) {
  ERR("Failed to probe drm fd %d.\n", fd);
  FREE(ctx);
  close(fd);
@@ -236,13 +241,20 @@ drm_create_adapter( int fd,
  return D3DERR_DRIVERINTERNALERROR;
  }
  
-dmabuf_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_SHARE_FD);

-throttle_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_THROTTLE);
-if (!dmabuf_ret || !dmabuf_ret->val.val_bool) {
-ERR("The driver is not capable of dma-buf sharing."
-"Abandon to load nine state tracker\n");
-drm_destroy(>base);
-return D3DERR_DRIVERINTERNALERROR;
+if (!software_device) {
+/*
+ * The software renderer isn't a DRM device and doesn't support
+ * pipe_loader_configuration.
+ * The KMS winsys supports SHARE_FD, so skip this check.
+ */
+dmabuf_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_SHARE_FD);
+throttle_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_THROTTLE);
+if (!dmabuf_ret || !dmabuf_ret->val.val_bool) {
+ERR("The driver is not capable of dma-buf sharing."
+"Abandon to load nine state tracker\n");
+drm_destroy(>base);
+return D3DERR_DRIVERINTERNALERROR;
+}
  }
  
  if (throttle_ret && throttle_ret->val.val_int != -1) {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] st/nine: Disable depth write when nothing gets updated

2019-03-09 Thread Axel Davy

On 08/03/2019 07:26, Kenneth Graunke wrote:

On Thursday, March 7, 2019 2:23:53 PM PST Axel Davy wrote:
I don't think we actually need the NEVER check, but it seems fine.

Patches 1 and 3 are:
Reviewed-by: Kenneth Graunke 

I'm not really up to speed to review the others.


Thanks, I got a review for the other patches from Patrick.


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/5] st/nine: Do not advertise CANMANAGERESOURCE

2019-03-07 Thread Axel Davy
It doesn't seem the main vendors advertise it.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/adapter9.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/adapter9.c 
b/src/gallium/state_trackers/nine/adapter9.c
index 0634d5918ce..94a5d8d2aa3 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -547,7 +547,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 
 pCaps->Caps = 0;
 
-pCaps->Caps2 = D3DCAPS2_CANMANAGERESOURCE |
+pCaps->Caps2 = /* D3DCAPS2_CANMANAGERESOURCE | */
 /* D3DCAPS2_CANSHARERESOURCE | */
 /* D3DCAPS2_CANCALIBRATEGAMMA | */
D3DCAPS2_DYNAMICTEXTURES |
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/5] st/nine: Change a few advertised caps

2019-03-07 Thread Axel Davy
Most hw on the native platform advertise these
caps this way.

D3DCAPS_READ_SCANLINE: We don't really have hardware
support for that, but many games don't even check the
flag, and expect GetRasterStatus to work, which is
why we emulated it with a timer (like wine). So we
may as well advertise the cap.
D3DCURSORCAPS_LOWRES: I don't know what is the status
of this on X11, but I don't know of any dx9 game
running at height < 400 either.
D3DPTEXTURECAPS_TEXREPEATNOTSCALEDBYSIZE: The cap should
correspond to what the current generation of hw is doing.

Signed-off-by: Axel Davy 
---
 include/D3D9/d3d9caps.h| 3 +++
 src/gallium/state_trackers/nine/adapter9.c | 6 +++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/D3D9/d3d9caps.h b/include/D3D9/d3d9caps.h
index 0cce5d3f689..70f9919c53d 100644
--- a/include/D3D9/d3d9caps.h
+++ b/include/D3D9/d3d9caps.h
@@ -26,6 +26,9 @@
 #include "d3d9types.h"
 
 /* Caps flags */
+#define D3DCAPS_OVERLAY   0x0800
+#define D3DCAPS_READ_SCANLINE 0x0002
+
 #define D3DCAPS2_FULLSCREENGAMMA   0x0002
 #define D3DCAPS2_CANCALIBRATEGAMMA 0x0010
 #define D3DCAPS2_RESERVED  0x0200
diff --git a/src/gallium/state_trackers/nine/adapter9.c 
b/src/gallium/state_trackers/nine/adapter9.c
index 94a5d8d2aa3..3aa95b93b2f 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -545,7 +545,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 
 pCaps->AdapterOrdinal = 0;
 
-pCaps->Caps = 0;
+pCaps->Caps = D3DCAPS_READ_SCANLINE;
 
 pCaps->Caps2 = /* D3DCAPS2_CANMANAGERESOURCE | */
 /* D3DCAPS2_CANSHARERESOURCE | */
@@ -568,7 +568,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPRESENT_INTERVAL_THREE |
D3DPRESENT_INTERVAL_FOUR |
D3DPRESENT_INTERVAL_IMMEDIATE;
-pCaps->CursorCaps = D3DCURSORCAPS_COLOR | D3DCURSORCAPS_LOWRES;
+pCaps->CursorCaps = D3DCURSORCAPS_COLOR /* | D3DCURSORCAPS_LOWRES*/;
 
 pCaps->DevCaps = D3DDEVCAPS_CANBLTSYSTONONLOCAL |
  D3DDEVCAPS_CANRENDERAFTERFLIP |
@@ -678,7 +678,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 D3DPTEXTURECAPS_ALPHAPALETTE |
 D3DPTEXTURECAPS_PERSPECTIVE |
 D3DPTEXTURECAPS_PROJECTED |
-/*D3DPTEXTURECAPS_TEXREPEATNOTSCALEDBYSIZE |*/
+D3DPTEXTURECAPS_TEXREPEATNOTSCALEDBYSIZE |
 D3DPTEXTURECAPS_CUBEMAP |
 D3DPTEXTURECAPS_VOLUMEMAP |
 D3DNPIPECAP(NPOT_TEXTURES, D3DPTEXTURECAPS_POW2) |
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/5] d3dadapter9: Support software renderer on any DRI device

2019-03-07 Thread Axel Davy
From: Patrick Rudolph 

If D3D_ALWAYS_SOFTWARE is set for debugging purposes,
run on any DRI enabled platform.
Instead of probing for a compatible gallium driver (which might
fail if there's none) always use the KMS DRI software renderer.

Allows to run nine on i915 when D3D_ALWAYS_SOFTWARE=1.

Signed-off-by: Patrick Rudolph 
---
 src/gallium/targets/d3dadapter9/drm.c | 28 +++
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/src/gallium/targets/d3dadapter9/drm.c 
b/src/gallium/targets/d3dadapter9/drm.c
index 1d01d4a067c..28dbd2ef9db 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -205,6 +205,7 @@ drm_create_adapter( int fd,
 struct d3dadapter9drm_context *ctx = CALLOC_STRUCT(d3dadapter9drm_context);
 HRESULT hr;
 bool different_device;
+bool software_device;
 const struct drm_conf_ret *throttle_ret = NULL;
 const struct drm_conf_ret *dmabuf_ret = NULL;
 driOptionCache defaultInitOptions;
@@ -222,7 +223,11 @@ drm_create_adapter( int fd,
 ctx->fd = fd;
 ctx->base.linear_framebuffer = different_device;
 
-if (!pipe_loader_drm_probe_fd(>dev, fd)) {
+const char *force_sw = getenv("D3D_ALWAYS_SOFTWARE");
+software_device = force_sw && !strcmp(force_sw, "1");
+
+if ((software_device && !pipe_loader_sw_probe_kms(>dev, fd)) ||
+(!software_device && !pipe_loader_drm_probe_fd(>dev, fd))) {
 ERR("Failed to probe drm fd %d.\n", fd);
 FREE(ctx);
 close(fd);
@@ -236,13 +241,20 @@ drm_create_adapter( int fd,
 return D3DERR_DRIVERINTERNALERROR;
 }
 
-dmabuf_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_SHARE_FD);
-throttle_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_THROTTLE);
-if (!dmabuf_ret || !dmabuf_ret->val.val_bool) {
-ERR("The driver is not capable of dma-buf sharing."
-"Abandon to load nine state tracker\n");
-drm_destroy(>base);
-return D3DERR_DRIVERINTERNALERROR;
+if (!software_device) {
+/*
+ * The software renderer isn't a DRM device and doesn't support
+ * pipe_loader_configuration.
+ * The KMS winsys supports SHARE_FD, so skip this check.
+ */
+dmabuf_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_SHARE_FD);
+throttle_ret = pipe_loader_configuration(ctx->dev, DRM_CONF_THROTTLE);
+if (!dmabuf_ret || !dmabuf_ret->val.val_bool) {
+ERR("The driver is not capable of dma-buf sharing."
+"Abandon to load nine state tracker\n");
+drm_destroy(>base);
+return D3DERR_DRIVERINTERNALERROR;
+}
 }
 
 if (throttle_ret && throttle_ret->val.val_int != -1) {
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/5] st/nine: Disable depth write when nothing gets updated

2019-03-07 Thread Axel Davy
I do not see any perf impact on radeonsi, but it
seems iris needs this.
It seems something sensible to do.

Signed-off-by: Axel Davy 
Reviewed-by: Timur Kristóf 
Tested-by: Andre Heider 
---
It may be argued this kind of stuff should be done in the driver.
I don't mind either way. The ogl state tracker already does that
optimization.
 src/gallium/state_trackers/nine/nine_pipe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_pipe.c 
b/src/gallium/state_trackers/nine/nine_pipe.c
index a84a17f551f..b69ddb67113 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.c
+++ b/src/gallium/state_trackers/nine/nine_pipe.c
@@ -36,8 +36,11 @@ nine_convert_dsa_state(struct pipe_depth_stencil_alpha_state 
*dsa_state,
 
 if (rs[D3DRS_ZENABLE]) {
 dsa.depth.enabled = 1;
-dsa.depth.writemask = !!rs[D3DRS_ZWRITEENABLE];
 dsa.depth.func = d3dcmpfunc_to_pipe_func(rs[D3DRS_ZFUNC]);
+/* Disable depth write if no change can occur */
+dsa.depth.writemask = !!rs[D3DRS_ZWRITEENABLE] &&
+dsa.depth.func != PIPE_FUNC_EQUAL &&
+dsa.depth.func != PIPE_FUNC_NEVER;
 }
 
 if (rs[D3DRS_STENCILENABLE]) {
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/5] st/nine: Do not advertise support for D15S1 and D24X4S4

2019-03-07 Thread Axel Davy
The former is supported on Matrox cards but no other hw.
The latter isn't supported anywhere.

It is fine to not advertise them as supported,
and it could prevent apps to trigger weird rendering paths.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_pipe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_pipe.c 
b/src/gallium/state_trackers/nine/nine_pipe.c
index b69ddb67113..dd858aef743 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.c
+++ b/src/gallium/state_trackers/nine/nine_pipe.c
@@ -286,10 +286,10 @@ const enum pipe_format nine_d3d9_to_pipe_format_map[120] =
[D3DFMT_A2W10V10U10]   = PIPE_FORMAT_R10SG10SB10SA2U_NORM,
[D3DFMT_D16_LOCKABLE]  = PIPE_FORMAT_Z16_UNORM,
[D3DFMT_D32]   = PIPE_FORMAT_Z32_UNORM,
-   [D3DFMT_D15S1] = PIPE_FORMAT_Z24_UNORM_S8_UINT,
+   [D3DFMT_D15S1] = PIPE_FORMAT_NONE,
[D3DFMT_D24S8] = PIPE_FORMAT_S8_UINT_Z24_UNORM,
[D3DFMT_D24X8] = PIPE_FORMAT_X8Z24_UNORM,
-   [D3DFMT_D24X4S4]   = PIPE_FORMAT_Z24_UNORM_S8_UINT,
+   [D3DFMT_D24X4S4]   = PIPE_FORMAT_NONE,
[D3DFMT_D16]   = PIPE_FORMAT_Z16_UNORM,
[D3DFMT_D32F_LOCKABLE] = PIPE_FORMAT_Z32_FLOAT,
[D3DFMT_D24FS8]= PIPE_FORMAT_Z32_FLOAT_S8X24_UINT,
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH] st/nine: Immediately upload user provided textures

2019-03-06 Thread Axel Davy

On 06/03/2019 20:13, Dylan Baker wrote:

Quoting Axel Davy (2019-01-22 12:08:05)

Fixes regression caused by
42d672fa6a766363e5703f119607f7c7975918aa
st/nine: Bind src not dst in nine_context_box_upload

Before that patch, for user provided textures,
when the texture was destroyed, the safety
check for pending uploads, which according to
the code "Following condition cannot happen currently",
was flushing the queue and thus triggering the upload.

After the patch, the texture destruction was delayed after
the upload. However the user frees the texture buffer,
as it thinks the texture released.

Instead of reverting the faulty patch,
this patch instead flushes the csmt queue right away
after queuing the upload for this type of textures.
This is more future-proof, as we may want to bind the
surface for other reasons in the future.

Signed-off-by: Axel Davy 
Cc: 18.3 
---
The regression affects Mesa 18.3.2. At least HL2 lost coast
is affected, has artifacts and crashes at the menu.

  src/gallium/state_trackers/nine/surface9.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/src/gallium/state_trackers/nine/surface9.c 
b/src/gallium/state_trackers/nine/surface9.c
index f94f7c62583..7f4ecf559e8 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -668,6 +668,19 @@ NineSurface9_CopyMemToDefault( struct NineSurface9 *This,
  From->data, From->stride,
  0, /* depth = 1 */
  _box);
+if (From->texture == D3DRTYPE_TEXTURE) {
+struct NineTexture9 *tex =
+NineTexture9(From->base.base.container);
+/* D3DPOOL_SYSTEMMEM with buffer content passed
+ * from the user: execute the upload right now.
+ * It is possible it is enough to delay upload
+ * until the surface refcount is 0, but the
+ * bind refcount may not be 0, and thus the dtor
+ * is not executed (and doesn't trigger the
+ * pending_uploads_counter check). */
+if (!tex->managed_buffer)
+nine_csmt_process(This->base.base.device);
+}
  
  if (This->data_conversion)

  (void) util_format_translate(This->format_conversion,
--
2.20.1


Should I pick this to 19.0 as well after it lands?

Dylan


Well, it looks like to be already in the 19.0 tree:
https://cgit.freedesktop.org/mesa/mesa/commit/?h=19.0=d7433c22e6c9624ca5275a3cd35be79caed9fffc


Yours,


Axel Davy

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH 1/2] st/nine: Ignore window size if error

2019-02-28 Thread Axel Davy

On 28/02/2019 12:54, Emil Velikov wrote:

On Wed, 27 Feb 2019 at 22:49, Axel Davy  wrote:

Check GetWindowInfo and ignore the computed sizes
if there is an error.

Fixes the regression caused by:
commit 2318ca68bbeb4fa6e21a4d8c650cec3f64246596
"st/nine: Handle window resize when a presentation buffer is used"
when using old wine gallium nine patches

Related issues:
https://github.com/iXit/Mesa-3D/issues/331
https://github.com/iXit/Mesa-3D/issues/332

Fixes also crash at window destruction.

Cc: mesa-sta...@lists.freedesktop.org

Signed-off-by: Axel Davy 
---

Nittiest of nits: the following takes 1/3 the cognitive effort.




Nittiest of nits do matter. Thank you for the suggestion, I shall take 
replace the commit message with your suggestion !





Check GetWindowInfo and ignore the computed sizes if there is an error.

Fixes a regression caused by earlier commit when using old wine gallium
nine patches.

Should also address a crash at window destruction.

Related issues:
  https://github.com/iXit/Mesa-3D/issues/331
  https://github.com/iXit/Mesa-3D/issues/332

Cc: mesa-sta...@lists.freedesktop.org
Fixes: 2318ca68bbe ("st/nine: Handle window resize when a presentation
buffer is used")
Signed-off-by: Axel Davy 


HTH
-Emil



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] st/nine: Ignore multisample quality level if no ms

2019-02-27 Thread Axel Davy
Apparently instead of returning error when passing
a quality level different than 0 for
D3DMULTISAMPLE_NONE, we should pass.

Fixes: https://github.com/iXit/Mesa-3D/issues/340

Cc: mesa-sta...@lists.freedesktop.org

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_pipe.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/nine/nine_pipe.h 
b/src/gallium/state_trackers/nine/nine_pipe.h
index 7b68c09c47a..0595da5535a 100644
--- a/src/gallium/state_trackers/nine/nine_pipe.h
+++ b/src/gallium/state_trackers/nine/nine_pipe.h
@@ -377,6 +377,10 @@ d3dmultisample_type_check(struct pipe_screen *screen,
 if (levels)
 *levels = 1;
 
+/* Ignores multisamplequality */
+if (*multisample == D3DMULTISAMPLE_NONE)
+return D3D_OK;
+
 if (*multisample == D3DMULTISAMPLE_NONMASKABLE) {
 if (depth_stencil_format(format))
 bind = d3d9_get_pipe_depth_format_bindings(format);
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] st/nine: Ignore window size if error

2019-02-27 Thread Axel Davy
Check GetWindowInfo and ignore the computed sizes
if there is an error.

Fixes the regression caused by:
commit 2318ca68bbeb4fa6e21a4d8c650cec3f64246596
"st/nine: Handle window resize when a presentation buffer is used"
when using old wine gallium nine patches

Related issues:
https://github.com/iXit/Mesa-3D/issues/331
https://github.com/iXit/Mesa-3D/issues/332

Fixes also crash at window destruction.

Cc: mesa-sta...@lists.freedesktop.org

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/swapchain9.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index 6c22be24c7c..36e07310400 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -750,9 +750,16 @@ present( struct NineSwapChain9 *This,
 if (This->params.SwapEffect == D3DSWAPEFFECT_DISCARD)
 handle_draw_cursor_and_hud(This, resource);
 
-ID3DPresent_GetWindowInfo(This->present, hDestWindowOverride, 
_width, _height, _depth);
+hr = ID3DPresent_GetWindowInfo(This->present, hDestWindowOverride, 
_width, _height, _depth);
 (void)target_depth;
 
+/* Can happen with old Wine (presentation can still succeed),
+ * or at window destruction. */
+if (FAILED(hr) || target_width == 0 || target_height == 0) {
+target_width = resource->width0;
+target_height = resource->height0;
+}
+
 /* Switch to using presentation buffers on window resize.
  * Note: Most apps should resize the d3d back buffers when
  * a window resize is detected, which will result in a call to
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Axel Davy

On 13/02/2019 17:42, Marek Olšák wrote:
On Wed, Feb 13, 2019 at 2:28 AM Axel Davy <mailto:davyax...@gmail.com>> wrote:


On 13/02/2019 06:15, Marek Olšák wrote:
> I decided to enable this optimization on all Pro graphics cards.
> The reason is that I haven't had time to benchmark games.
> This decision may be changed based on community feedback, etc.


Could the decision to run the optimization be based on some perf
counters related to culling ? If enough vertices are culled, you'd
enable the optimization.


No, that's not possible. When I enable this, all gfx counters and 
pipeline statistics report that (almost) no primitives are culled, 
because the compute shader culls them before the gfx pipeline.


You would disable by default the optimization. The perf counters would 
then be meaningful. If the perf counter tells you enough primitives are 
culled, you'd switch to the optimization and would stop looking at the 
counters. No need to enable if only a few things are culled.


The best of course is that if you detect at some point the optimization 
is worth it, it won't stop being worth it in a different game scene, but 
it should be already a good filter, as if you never go above the 
threshold, you definitely don't need the optimization.





There seems to be an AMD patent on the optimization, I failed to
see it
mentioned, maybe it should be pointed out somewhere.


Unlikely. It's based on this:
https://frostbite-wp-prd.s3.amazonaws.com/wp-content/uploads/2016/03/29204330/GDC_2016_Compute.pdf

And this is pretty much a simpler version of what I implemented:
https://gpuopen.com/gaming-product/geometryfx/

Marek



This is what I found:

https://patents.google.com/patent/US20180033184A1/en

Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-12 Thread Axel Davy

On 13/02/2019 06:15, Marek Olšák wrote:

I decided to enable this optimization on all Pro graphics cards.
The reason is that I haven't had time to benchmark games.
This decision may be changed based on community feedback, etc.



Could the decision to run the optimization be based on some perf 
counters related to culling ? If enough vertices are culled, you'd 
enable the optimization.


There seems to be an AMD patent on the optimization, I failed to see it 
mentioned, maybe it should be pointed out somewhere.



Yours,

Axel


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/4] radeonsi: use SDMA for uploading data through const_uploader

2019-02-07 Thread Axel Davy

On 07/02/2019 02:22, Marek Olšák wrote:
  
+	bool use_sdma_upload = sscreen->info.has_dedicated_vram && sctx->dma_cs && debug_get_bool_option("SDMA", true);

+   sctx->b.const_uploader = u_upload_create(>b, 256 * 1024,
+0, PIPE_USAGE_DEFAULT,
+SI_RESOURCE_FLAG_32BIT |
+(use_sdma_upload ?
+ 
SI_RESOURCE_FLAG_UPLOAD_FLUSH_EXPLICIT_VIA_SDMA :
+ 
(sscreen->cpdma_prefetch_writes_memory ?
+  0 : 
SI_RESOURCE_FLAG_READ_ONLY)));
+   if (!sctx->b.const_uploader)
+   goto fail;
+
+   if (use_sdma_upload)
+   u_upload_enable_flush_explicit(sctx->b.const_uploader);
+



I see that APU are not affected by the change.

Are they affected by the issue this patch aims to fix though ? If so, 
wouldn't it make sense to switch to PIPE_USAGE_STREAM for APUs ?



Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/nine: Immediately upload user provided textures

2019-01-22 Thread Axel Davy
Fixes regression caused by
42d672fa6a766363e5703f119607f7c7975918aa
st/nine: Bind src not dst in nine_context_box_upload

Before that patch, for user provided textures,
when the texture was destroyed, the safety
check for pending uploads, which according to
the code "Following condition cannot happen currently",
was flushing the queue and thus triggering the upload.

After the patch, the texture destruction was delayed after
the upload. However the user frees the texture buffer,
as it thinks the texture released.

Instead of reverting the faulty patch,
this patch instead flushes the csmt queue right away
after queuing the upload for this type of textures.
This is more future-proof, as we may want to bind the
surface for other reasons in the future.

Signed-off-by: Axel Davy 
Cc: 18.3 
---
The regression affects Mesa 18.3.2. At least HL2 lost coast
is affected, has artifacts and crashes at the menu.

 src/gallium/state_trackers/nine/surface9.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/state_trackers/nine/surface9.c 
b/src/gallium/state_trackers/nine/surface9.c
index f94f7c62583..7f4ecf559e8 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -668,6 +668,19 @@ NineSurface9_CopyMemToDefault( struct NineSurface9 *This,
 From->data, From->stride,
 0, /* depth = 1 */
 _box);
+if (From->texture == D3DRTYPE_TEXTURE) {
+struct NineTexture9 *tex =
+NineTexture9(From->base.base.container);
+/* D3DPOOL_SYSTEMMEM with buffer content passed
+ * from the user: execute the upload right now.
+ * It is possible it is enough to delay upload
+ * until the surface refcount is 0, but the
+ * bind refcount may not be 0, and thus the dtor
+ * is not executed (and doesn't trigger the
+ * pending_uploads_counter check). */
+if (!tex->managed_buffer)
+nine_csmt_process(This->base.base.device);
+}
 
 if (This->data_conversion)
 (void) util_format_translate(This->format_conversion,
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: use compute for resource_copy_region when possible

2019-01-15 Thread Axel Davy
y_image_compute_shader(struct pipe_context *ctx)
+{
+   static const char text[] =
+   "COMP\n"
+   "PROPERTY CS_FIXED_BLOCK_WIDTH 8\n"
+   "PROPERTY CS_FIXED_BLOCK_HEIGHT 8\n"
+   "PROPERTY CS_FIXED_BLOCK_DEPTH 1\n"
+   "DCL SV[0], THREAD_ID\n"
+   "DCL SV[1], BLOCK_ID\n"
+   "DCL IMAGE[0], 2D_ARRAY, PIPE_FORMAT_R32G32B32A32_FLOAT, WR\n"
+   "DCL IMAGE[1], 2D_ARRAY, PIPE_FORMAT_R32G32B32A32_FLOAT, WR\n"
+   "DCL CONST[0][0..1]\n" // 0:xyzw 1:xyzw
+   "DCL TEMP[0..4], LOCAL\n"
+   "IMM[0] UINT32 {8, 1, 0, 0}\n"
+   "MOV TEMP[0].xyz, CONST[0][0].xyzw\n"
+   "UMAD TEMP[1].xyz, SV[1].xyzz, IMM[0].xxyy, SV[0].xyzz\n"
+   "UADD TEMP[2].xyz, TEMP[1].xyzx, TEMP[0].xyzx\n"
+   "LOAD TEMP[3], IMAGE[0], TEMP[2].xyzx, 2D_ARRAY, 
PIPE_FORMAT_R32G32B32A32_FLOAT\n"
+   "MOV TEMP[4].xyz, CONST[0][1].xyzw\n"
+   "UADD TEMP[2].xyz, TEMP[1].xyzx, TEMP[4].xyzx\n"
+   "STORE IMAGE[1], TEMP[2].xyzz, TEMP[3], 2D_ARRAY, 
PIPE_FORMAT_R32G32B32A32_FLOAT\n"
+   "END\n";
+
+   struct tgsi_token tokens[1024];
+   struct pipe_compute_state state = {0};
+
+   if (!tgsi_text_translate(text, tokens, ARRAY_SIZE(tokens))) {
+   assert(false);
+   return NULL;
+   }
+
+   state.ir_type = PIPE_SHADER_IR_TGSI;
+   state.prog = tokens;
+
+   return ctx->create_compute_state(ctx, );
+}
+


Hi,

Here is my summary of my understanding of the proposal implementation 
for the copy implementation:


. Store input and output (x, y, z) offsets into a constant buffer
. (8, 8) workgroups
. Each workitem copies pixel (x+get_group_id(0)*8+get_local_id(0), 
y+get_group_id(1)*8+get_local_id(1), 
z+get_group_id(2)*8+get_local_id(2)). The pixel is RGBA.


Some questions:
. What happens when the textures do not have some components ? R32F for 
example
. I'm not familiar with using images in compute shaders, but is it ok to 
declare as ARGB32F even if the input/output data is not float ?


Some comments:

. If src_x, dstx, etcs are not multiple of (8, 8), the workgroups won't 
be aligned well with the tiling pattern. Fortunately cache should 
mitigate the loss, but if that's an important case to handle, one could 
write the shader differently to have all workgroups (except at border) 
aligned. I guess one can benchmark see if that tiling alignment matters 
much here.

. Overhead can be reduced by copying several pixels per work-item.
. If the src and dst region are perfectly aligned with the tiling 
pattern, the copy can be reduced to just moving a rectangle of memory 
(no tiling) and could be implemented with dma_copy if no conversion is 
needed or with a shader using buffers (no images), which would avoid 
using the image sampling hw which I believe can be more limiting than 
sampling a buffer when there is a lot of wavefronts. The data conversion 
can be done for no cost in the shader as it should be memory bound.
. (8, 8) is not optimal for linear tiled images (but I guess we don't 
often get to use them with resource_copy_region).



But most likely you already know all that and consider this is not worth 
complicating the code to speed up corner cases.


Yours,


Axel Davy

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] st/nine: Enable debug info if NDEBUG is not set

2019-01-13 Thread Axel Davy
We want to have debug info as well if using
meson's debugoptimized when ndebug is off.

v2: use u_debug functions that do something
even if DEBUG is not set.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/basetexture9.c |  6 +++---
 src/gallium/state_trackers/nine/basetexture9.h |  2 +-
 src/gallium/state_trackers/nine/nine_debug.c   | 12 ++--
 src/gallium/state_trackers/nine/nine_debug.h   | 10 +-
 src/gallium/state_trackers/nine/nine_dump.c|  4 ++--
 src/gallium/state_trackers/nine/nine_dump.h|  6 +++---
 src/gallium/state_trackers/nine/nine_ff.c  |  2 +-
 src/gallium/state_trackers/nine/nine_state.c   |  2 +-
 src/gallium/state_trackers/nine/surface9.c |  4 ++--
 src/gallium/state_trackers/nine/surface9.h |  2 +-
 src/gallium/state_trackers/nine/volume9.c  |  2 +-
 11 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/src/gallium/state_trackers/nine/basetexture9.c 
b/src/gallium/state_trackers/nine/basetexture9.c
index 911eee6da20..441a0817461 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -28,7 +28,7 @@
 #include "cubetexture9.h"
 #include "volumetexture9.h"
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #include "nine_pipe.h"
 #include "nine_dump.h"
 #endif
@@ -605,7 +605,7 @@ NineBaseTexture9_UnLoad( struct NineBaseTexture9 *This )
 BASETEX_REGISTER_UPDATE(This);
 }
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 void
 NineBaseTexture9_Dump( struct NineBaseTexture9 *This )
 {
@@ -620,4 +620,4 @@ NineBaseTexture9_Dump( struct NineBaseTexture9 *This )
 This->base.info.array_size, This->base.info.last_level,
 This->managed.lod, This->managed.lod_resident);
 }
-#endif /* DEBUG */
+#endif /* DEBUG || !NDEBUG */
diff --git a/src/gallium/state_trackers/nine/basetexture9.h 
b/src/gallium/state_trackers/nine/basetexture9.h
index 10a7cea46da..19899c65825 100644
--- a/src/gallium/state_trackers/nine/basetexture9.h
+++ b/src/gallium/state_trackers/nine/basetexture9.h
@@ -150,7 +150,7 @@ NineBindTextureToDevice( struct NineDevice9 *device,
 nine_bind(slot, tex);
 }
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 void
 NineBaseTexture9_Dump( struct NineBaseTexture9 *This );
 #else
diff --git a/src/gallium/state_trackers/nine/nine_debug.c 
b/src/gallium/state_trackers/nine/nine_debug.c
index 1dcbca45854..904a40fde83 100644
--- a/src/gallium/state_trackers/nine/nine_debug.c
+++ b/src/gallium/state_trackers/nine/nine_debug.c
@@ -93,18 +93,18 @@ _nine_debug_printf( unsigned long flag,
 for (func += 4; func != f; ++func) { *ptr++ = tolower(*func); }
 *ptr = '\0';
 if (tid)
-debug_printf("nine:0x%08lx:%s:%s: ", tid, klass, ++f);
+_debug_printf("nine:0x%08lx:%s:%s: ", tid, klass, ++f);
 else
-debug_printf("nine:%s:%s: ", klass, ++f);
+_debug_printf("nine:%s:%s: ", klass, ++f);
 } else if (func) {
 if (tid)
-debug_printf("nine:0x%08lx:%s ", tid, func);
+_debug_printf("nine:0x%08lx:%s ", tid, func);
 else
-debug_printf("nine:%s ", func);
+_debug_printf("nine:%s ", func);
 }
 
 va_start(ap, fmt);
-debug_vprintf(fmt, ap);
+_debug_vprintf(fmt, ap);
 va_end(ap);
 }
 }
@@ -116,5 +116,5 @@ _nine_stub( const char *file,
 {
 const char *r = strrchr(file, '/');
 if (r == NULL) { r = strrchr(file, '\\'); }
-debug_printf("nine:%s:%d: %s STUB!\n", r ? ++r : file, line, func);
+_debug_printf("nine:%s:%d: %s STUB!\n", r ? ++r : file, line, func);
 }
diff --git a/src/gallium/state_trackers/nine/nine_debug.h 
b/src/gallium/state_trackers/nine/nine_debug.h
index 841438a66f8..2bbb73ef96a 100644
--- a/src/gallium/state_trackers/nine/nine_debug.h
+++ b/src/gallium/state_trackers/nine/nine_debug.h
@@ -33,7 +33,7 @@ _nine_debug_printf( unsigned long flag,
 
 #define ERR(fmt, ...) _nine_debug_printf(DBG_ERROR, __FUNCTION__, fmt, ## 
__VA_ARGS__)
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define WARN(fmt, ...) _nine_debug_printf(DBG_WARN, __FUNCTION__, fmt, ## 
__VA_ARGS__)
 #define WARN_ONCE(fmt, ...) \
 do { \
@@ -48,7 +48,7 @@ _nine_debug_printf( unsigned long flag,
 #define WARN_ONCE(fmt, ...)
 #endif
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define DBG_FLAG(flag, fmt, ...) \
 _nine_debug_printf(flag, __FUNCTION__, fmt, ## __VA_ARGS__)
 #else
@@ -90,7 +90,7 @@ _nine_stub( const char *file,
 const char *func,
 unsigned line );
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define STUB(ret) \
 do { \
 _nine_stub(__FILE__, __FUNCTION__, __LINE__); \
@@ -104,7

Re: [Mesa-dev] Thoughts after hitting 100 merge requests?

2019-01-12 Thread Axel Davy

Hi,

I'm not sure the promise "1 mail per pull request" is working well.
For example, taking one recent pull request
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/105

I didn't receive anything, nor
https://lists.freedesktop.org/archives/mesa-dev/2019-January/thread.html
yet.

I received some mails with [MR] in the title with two lines indicating 
merge requests, but that seems to be for a minority of the requests.


I guess the system is not automated right now.

I think there needs to be an automated system, and that it should look 
pretty close to what a cover-letter for a mail serie should look like, 
that is:
. The global stat diffs of the merge requests (which files are affected, 
how many modifications, etc)

. The summary of the request
. All the patch titles

I don't want to go open all merge requests in my browser to get that 
information.
So far I only went check the list of gitlab merge requests 3 times, 
whereas I go through my mails several times a day.



Yours,

Axel Davy




On 11/01/2019 17:57, Jason Ekstrand wrote:

All,

The mesa project has now hit 100 merge requests (36 are still open).  
I (and I'm sure others) would be curious to hear people's initial 
thoughts on the process.  What's working well?  What's not working?  
Is it total fail and should we go back to mailing lists?


--Jason

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/nine: Ignore null sized windows

2019-01-05 Thread Axel Davy

I drop this patch (for now).

It doesn't fully fix the issue, which is due to our wine implementation 
using the wrong window for ID3DPresent_GetWindowInfo.


The issue is fixed either with a wine patch or with a nine workaround.
The nine workaround consists in forcing the backend to use the correct 
window, which is simply passing hDestWindowOverride ? 
hDestWindowOverride : This->params.hDeviceWindow instead of 
hDestWindowOverride.


The problem about the nine patch is that maybe when hDestWindowOverride 
is not null, specific handling would be required in wine ? So far not to 
our knowledge, but that may tie our hands.


Anyone having opinion on whether I should push the workaround or 
consider the issue fixed if using an updated wine backend ?


Axel

On 03/01/2019 21:48, Axel Davy wrote:

If for some reason the window size detected
is null, just render at normal size.

Fixes the regression caused by:
commit 2318ca68bbeb4fa6e21a4d8c650cec3f64246596
"st/nine: Handle window resize when a presentation buffer is used"

Fixes: https://github.com/iXit/Mesa-3D/issues/331
Fixes: https://github.com/iXit/Mesa-3D/issues/332

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Axel Davy 
---
It may not apply cleanly to mesa stable. I can do another
patch for stable if needed. Basically it's the very same code
except the part "/* Switch to using presentation buffers ...*/"
would be replaced by "pipe = NineDevice9_GetPipe(This->base.device);"

  src/gallium/state_trackers/nine/swapchain9.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index 6c22be24c7c..ceaa1cd848a 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -753,6 +753,12 @@ present( struct NineSwapChain9 *This,
  ID3DPresent_GetWindowInfo(This->present, hDestWindowOverride, _width, 
_height, _depth);
  (void)target_depth;
  
+/* Can happen for a few frames. */

+if (target_width == 0 || target_height == 0) {
+target_width = resource->width0;
+target_height = resource->height0;
+}
+
  /* Switch to using presentation buffers on window resize.
   * Note: Most apps should resize the d3d back buffers when
   * a window resize is detected, which will result in a call to



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/nine: Ignore null sized windows

2019-01-03 Thread Axel Davy
If for some reason the window size detected
is null, just render at normal size.

Fixes the regression caused by:
commit 2318ca68bbeb4fa6e21a4d8c650cec3f64246596
"st/nine: Handle window resize when a presentation buffer is used"

Fixes: https://github.com/iXit/Mesa-3D/issues/331
Fixes: https://github.com/iXit/Mesa-3D/issues/332

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Axel Davy 
---
It may not apply cleanly to mesa stable. I can do another
patch for stable if needed. Basically it's the very same code
except the part "/* Switch to using presentation buffers ...*/"
would be replaced by "pipe = NineDevice9_GetPipe(This->base.device);"

 src/gallium/state_trackers/nine/swapchain9.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index 6c22be24c7c..ceaa1cd848a 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -753,6 +753,12 @@ present( struct NineSwapChain9 *This,
 ID3DPresent_GetWindowInfo(This->present, hDestWindowOverride, 
_width, _height, _depth);
 (void)target_depth;
 
+/* Can happen for a few frames. */
+if (target_width == 0 || target_height == 0) {
+target_width = resource->width0;
+target_height = resource->height0;
+}
+
 /* Switch to using presentation buffers on window resize.
  * Note: Most apps should resize the d3d back buffers when
  * a window resize is detected, which will result in a call to
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/nine: Enable debug info if NDEBUG is not set

2018-12-20 Thread Axel Davy
We want to have debug info as well if using
meson's debugoptimized when ndebug is off.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/basetexture9.c |  6 +++---
 src/gallium/state_trackers/nine/basetexture9.h |  2 +-
 src/gallium/state_trackers/nine/nine_debug.h   | 10 +-
 src/gallium/state_trackers/nine/nine_dump.c|  4 ++--
 src/gallium/state_trackers/nine/nine_dump.h|  6 +++---
 src/gallium/state_trackers/nine/nine_ff.c  |  2 +-
 src/gallium/state_trackers/nine/nine_state.c   |  2 +-
 src/gallium/state_trackers/nine/surface9.c |  4 ++--
 src/gallium/state_trackers/nine/surface9.h |  2 +-
 src/gallium/state_trackers/nine/volume9.c  |  2 +-
 10 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/nine/basetexture9.c 
b/src/gallium/state_trackers/nine/basetexture9.c
index 911eee6da20..441a0817461 100644
--- a/src/gallium/state_trackers/nine/basetexture9.c
+++ b/src/gallium/state_trackers/nine/basetexture9.c
@@ -28,7 +28,7 @@
 #include "cubetexture9.h"
 #include "volumetexture9.h"
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #include "nine_pipe.h"
 #include "nine_dump.h"
 #endif
@@ -605,7 +605,7 @@ NineBaseTexture9_UnLoad( struct NineBaseTexture9 *This )
 BASETEX_REGISTER_UPDATE(This);
 }
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 void
 NineBaseTexture9_Dump( struct NineBaseTexture9 *This )
 {
@@ -620,4 +620,4 @@ NineBaseTexture9_Dump( struct NineBaseTexture9 *This )
 This->base.info.array_size, This->base.info.last_level,
 This->managed.lod, This->managed.lod_resident);
 }
-#endif /* DEBUG */
+#endif /* DEBUG || !NDEBUG */
diff --git a/src/gallium/state_trackers/nine/basetexture9.h 
b/src/gallium/state_trackers/nine/basetexture9.h
index 10a7cea46da..19899c65825 100644
--- a/src/gallium/state_trackers/nine/basetexture9.h
+++ b/src/gallium/state_trackers/nine/basetexture9.h
@@ -150,7 +150,7 @@ NineBindTextureToDevice( struct NineDevice9 *device,
 nine_bind(slot, tex);
 }
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 void
 NineBaseTexture9_Dump( struct NineBaseTexture9 *This );
 #else
diff --git a/src/gallium/state_trackers/nine/nine_debug.h 
b/src/gallium/state_trackers/nine/nine_debug.h
index 841438a66f8..2bbb73ef96a 100644
--- a/src/gallium/state_trackers/nine/nine_debug.h
+++ b/src/gallium/state_trackers/nine/nine_debug.h
@@ -33,7 +33,7 @@ _nine_debug_printf( unsigned long flag,
 
 #define ERR(fmt, ...) _nine_debug_printf(DBG_ERROR, __FUNCTION__, fmt, ## 
__VA_ARGS__)
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define WARN(fmt, ...) _nine_debug_printf(DBG_WARN, __FUNCTION__, fmt, ## 
__VA_ARGS__)
 #define WARN_ONCE(fmt, ...) \
 do { \
@@ -48,7 +48,7 @@ _nine_debug_printf( unsigned long flag,
 #define WARN_ONCE(fmt, ...)
 #endif
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define DBG_FLAG(flag, fmt, ...) \
 _nine_debug_printf(flag, __FUNCTION__, fmt, ## __VA_ARGS__)
 #else
@@ -90,7 +90,7 @@ _nine_stub( const char *file,
 const char *func,
 unsigned line );
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define STUB(ret) \
 do { \
 _nine_stub(__FILE__, __FUNCTION__, __LINE__); \
@@ -104,7 +104,7 @@ _nine_stub( const char *file,
  * macro is designed to be used in conditionals ala
  * if (user_error(required condition)) { assertion failed }
  * It also prints debug message if the assertion fails. */
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define user_error(x) \
 (!(x) ? (DBG_FLAG(DBG_USER, "User assertion failed: `%s'\n", #x), TRUE) \
   : FALSE)
@@ -112,7 +112,7 @@ _nine_stub( const char *file,
 #define user_error(x) (!(x) ? TRUE : FALSE)
 #endif
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 #define user_warn(x) \
 if ((x)) { DBG_FLAG(DBG_USER, "User warning: `%s'\n", #x); }
 #else
diff --git a/src/gallium/state_trackers/nine/nine_dump.c 
b/src/gallium/state_trackers/nine/nine_dump.c
index 1ca550586e4..85ee266defb 100644
--- a/src/gallium/state_trackers/nine/nine_dump.c
+++ b/src/gallium/state_trackers/nine/nine_dump.c
@@ -8,7 +8,7 @@
 
 #include "nine_dump.h"
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 
 static char __thread tls[128];
 
@@ -810,4 +810,4 @@ nine_dump_D3DCAPS9(unsigned ch, const D3DCAPS9 *caps)
 FREE(s);
 }
 
-#endif /* DEBUG */
+#endif /* DEBUG || !NDEBUG */
diff --git a/src/gallium/state_trackers/nine/nine_dump.h 
b/src/gallium/state_trackers/nine/nine_dump.h
index a0ffe7bf6ab..72342557d77 100644
--- a/src/gallium/state_trackers/nine/nine_dump.h
+++ b/src/gallium/state_trackers/nine/nine_dump.h
@@ -16,7 +16,7 @@ const char *nine_D3DPRESENTFLAG_to_str(DWORD);
 const char *nine_D3DLOCK_to_str(DWORD);
 const char *nine_D3DSAMP_to_str(DWORD);
 
-#ifdef DEBUG
+#if defined(DEBUG) || !defined(NDEBUG)
 
 void
 nine_dum

[Mesa-dev] [PATCH 3/6] st/nine: Fix volumetexture dtor on ctor failure

2018-12-16 Thread Axel Davy
The dtor is called on allocation failure,
thus we must check the volumes are allocated
before trying to release them.

Signed-off-by: Axel Davy 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/state_trackers/nine/volumetexture9.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/volumetexture9.c 
b/src/gallium/state_trackers/nine/volumetexture9.c
index 5dec4844864..c7191bce688 100644
--- a/src/gallium/state_trackers/nine/volumetexture9.c
+++ b/src/gallium/state_trackers/nine/volumetexture9.c
@@ -141,7 +141,8 @@ NineVolumeTexture9_dtor( struct NineVolumeTexture9 *This )
 
 if (This->volumes) {
 for (l = 0; l <= This->base.base.info.last_level; ++l)
-NineUnknown_Destroy(>volumes[l]->base);
+if (This->volumes[l])
+NineUnknown_Destroy(>volumes[l]->base);
 FREE(This->volumes);
 }
 
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] st/nine: Bind src not dst in nine_context_box_upload

2018-12-16 Thread Axel Davy
nine_context_box_upload uploads a ram buffer (from src)
to a pipe_resource (dst).
We already have a refcount on the pipe_resource,
what needs to be protected from release is the ram buffer,
thus a reference to src.

Signed-off-by: Axel Davy 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/state_trackers/nine/nine_state.c | 6 +++---
 src/gallium/state_trackers/nine/nine_state.h | 2 +-
 src/gallium/state_trackers/nine/surface9.c   | 2 +-
 src/gallium/state_trackers/nine/volume9.c| 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 273be88e2b8..4872e24f439 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -2434,7 +2434,7 @@ CSMT_ITEM_NO_WAIT_WITH_COUNTER(nine_context_range_upload,
 }
 
 CSMT_ITEM_NO_WAIT_WITH_COUNTER(nine_context_box_upload,
-   ARG_BIND_REF(struct NineUnknown, dst),
+   ARG_BIND_REF(struct NineUnknown, src_ref),
ARG_BIND_RES(struct pipe_resource, res),
ARG_VAL(unsigned, level),
ARG_COPY_REF(struct pipe_box, dst_box),
@@ -2449,8 +2449,8 @@ CSMT_ITEM_NO_WAIT_WITH_COUNTER(nine_context_box_upload,
 struct pipe_transfer *transfer = NULL;
 uint8_t *map;
 
-/* We just bind dst for the bind count */
-(void)dst;
+/* Binding src_ref avoids release before upload */
+(void)src_ref;
 
 map = pipe->transfer_map(pipe,
  res,
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index 51e5e326527..8de9f84a256 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -568,7 +568,7 @@ nine_context_range_upload(struct NineDevice9 *device,
 void
 nine_context_box_upload(struct NineDevice9 *device,
 unsigned *counter,
-struct NineUnknown *dst,
+struct NineUnknown *src_ref,
 struct pipe_resource *res,
 unsigned level,
 const struct pipe_box *dst_box,
diff --git a/src/gallium/state_trackers/nine/surface9.c 
b/src/gallium/state_trackers/nine/surface9.c
index 5fd662fa049..10518219a0a 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -660,7 +660,7 @@ NineSurface9_CopyMemToDefault( struct NineSurface9 *This,
 
 nine_context_box_upload(This->base.base.device,
 >pending_uploads_counter,
-(struct NineUnknown *)This,
+(struct NineUnknown *)From,
 r_dst,
 This->level,
 _box,
diff --git a/src/gallium/state_trackers/nine/volume9.c 
b/src/gallium/state_trackers/nine/volume9.c
index ec811aeba13..840f01dae10 100644
--- a/src/gallium/state_trackers/nine/volume9.c
+++ b/src/gallium/state_trackers/nine/volume9.c
@@ -449,7 +449,7 @@ NineVolume9_CopyMemToDefault( struct NineVolume9 *This,
 
 nine_context_box_upload(This->base.device,
 >pending_uploads_counter,
-(struct NineUnknown *)This,
+(struct NineUnknown *)From,
 r_dst,
 This->level,
 _box,
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] st/nine: Add src reference to nine_context_range_upload

2018-12-16 Thread Axel Davy
Just like nine_context_box_upload, nine_context_range_upload
should reference the src, which holds the ram source buffer.

Fixes: https://github.com/iXit/Mesa-3D/issues/327
Signed-off-by: Axel Davy 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/state_trackers/nine/buffer9.h| 4 +++-
 src/gallium/state_trackers/nine/nine_state.c | 4 
 src/gallium/state_trackers/nine/nine_state.h | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/buffer9.h 
b/src/gallium/state_trackers/nine/buffer9.h
index b04a0a721bb..1803d8d6405 100644
--- a/src/gallium/state_trackers/nine/buffer9.h
+++ b/src/gallium/state_trackers/nine/buffer9.h
@@ -104,7 +104,9 @@ NineBuffer9_Upload( struct NineBuffer9 *This )
 struct NineDevice9 *device = This->base.base.device;
 
 assert(This->base.pool == D3DPOOL_MANAGED && This->managed.dirty);
-nine_context_range_upload(device, >managed.pending_upload, 
This->base.resource,
+nine_context_range_upload(device, >managed.pending_upload,
+  (struct NineUnknown *)This,
+  This->base.resource,
   This->managed.dirty_box.x,
   This->managed.dirty_box.width,
   (char *)This->managed.data + 
This->managed.dirty_box.x);
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 4872e24f439..02673c1f6ed 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -2423,6 +2423,7 @@ CSMT_ITEM_NO_WAIT(nine_context_gen_mipmap,
 }
 
 CSMT_ITEM_NO_WAIT_WITH_COUNTER(nine_context_range_upload,
+   ARG_BIND_REF(struct NineUnknown, src_ref),
ARG_BIND_RES(struct pipe_resource, res),
ARG_VAL(unsigned, offset),
ARG_VAL(unsigned, size),
@@ -2430,6 +2431,9 @@ CSMT_ITEM_NO_WAIT_WITH_COUNTER(nine_context_range_upload,
 {
 struct nine_context *context = >context;
 
+/* Binding src_ref avoids release before upload */
+(void)src_ref;
+
 context->pipe->buffer_subdata(context->pipe, res, 0, offset, size, data);
 }
 
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index 8de9f84a256..55960007bfb 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -560,6 +560,7 @@ nine_context_gen_mipmap(struct NineDevice9 *device,
 void
 nine_context_range_upload(struct NineDevice9 *device,
   unsigned *counter,
+  struct NineUnknown *src_ref,
   struct pipe_resource *res,
   unsigned offset,
   unsigned size,
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] st/nine: Switch to presentation buffer if resize is detected

2018-12-16 Thread Axel Davy
This enables to match the window size
on resize on all cases, as it only works
currently with presentation buffers.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/swapchain9.c | 37 +++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index 138e8816a05..6c22be24c7c 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -715,7 +715,7 @@ present( struct NineSwapChain9 *This,
 struct pipe_fence_handle *fence;
 HRESULT hr;
 struct pipe_blit_info blit;
-int target_width, target_height, target_depth;
+int target_width, target_height, target_depth, i;
 
 DBG("present: This=%p pSourceRect=%p pDestRect=%p "
 "pDirtyRegion=%p hDestWindowOverride=%p"
@@ -753,6 +753,41 @@ present( struct NineSwapChain9 *This,
 ID3DPresent_GetWindowInfo(This->present, hDestWindowOverride, 
_width, _height, _depth);
 (void)target_depth;
 
+/* Switch to using presentation buffers on window resize.
+ * Note: Most apps should resize the d3d back buffers when
+ * a window resize is detected, which will result in a call to
+ * NineSwapChain9_Resize. Thus everything will get released,
+ * and it will switch back to not using separate presentation
+ * buffers. */
+if (!This->present_buffers[0] &&
+(target_width != resource->width0 || target_height != 
resource->height0)) {
+BOOL failure = false;
+struct pipe_resource *new_resource[This->num_back_buffers];
+D3DWindowBuffer *new_handles[This->num_back_buffers];
+for (i = 0; i < This->num_back_buffers; i++) {
+/* Note: if (!new_handles[i]), new_resource[i]
+ * gets released and contains NULL */
+create_present_buffer(This, target_width, target_height, 
_resource[i], _handles[i]);
+if (!new_handles[i])
+failure = true;
+}
+if (failure) {
+for (i = 0; i < This->num_back_buffers; i++) {
+if (new_resource[i])
+pipe_resource_reference(_resource[i], NULL);
+if (new_handles[i])
+D3DWindowBuffer_release(This, new_handles[i]);
+}
+} else {
+for (i = 0; i < This->num_back_buffers; i++) {
+D3DWindowBuffer_release(This, This->present_handles[i]);
+This->present_handles[i] = new_handles[i];
+pipe_resource_reference(>present_buffers[i], 
new_resource[i]);
+pipe_resource_reference(_resource[i], NULL);
+}
+}
+}
+
 pipe = NineDevice9_GetPipe(This->base.device);
 
 if (This->present_buffers[0]) {
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] st/nine: Increase the limit of cached ff shaders

2018-12-16 Thread Axel Davy
100 is too small for some games, which triggers recompilations
every frame. Increase to 1024.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index 261be276ad8..cb77d6915b9 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -2138,7 +2138,7 @@ nine_ff_prune_vs(struct NineDevice9 *device)
 {
 struct nine_context *context = >context;
 
-if (device->ff.num_vs > 100) {
+if (device->ff.num_vs > 1024) {
 /* could destroy the bound one here, so unbind */
 context->pipe->bind_vs_state(context->pipe, NULL);
 util_hash_table_foreach(device->ff.ht_vs, nine_ff_ht_delete_cb, NULL);
@@ -2152,7 +2152,7 @@ nine_ff_prune_ps(struct NineDevice9 *device)
 {
 struct nine_context *context = >context;
 
-if (device->ff.num_ps > 100) {
+if (device->ff.num_ps > 1024) {
 /* could destroy the bound one here, so unbind */
 context->pipe->bind_fs_state(context->pipe, NULL);
 util_hash_table_foreach(device->ff.ht_ps, nine_ff_ht_delete_cb, NULL);
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] st/nine: Use helper to release swapchain buffers later

2018-12-16 Thread Axel Davy
This patch introduces a structure to release the
present_handles only when they are fully released
by the server, thus making
"DestroyD3DWindowBuffer" actually release the buffer
right away when called.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/swapchain9.c | 49 
 src/gallium/state_trackers/nine/swapchain9.h |  1 +
 2 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index d330c855726..138e8816a05 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -128,6 +128,40 @@ D3DWindowBuffer_create(struct NineSwapChain9 *This,
 return ret;
 }
 
+static void
+D3DWindowBuffer_release(struct NineSwapChain9 *This,
+D3DWindowBuffer *present_handle)
+{
+int i;
+/* Add it to the 'pending release' list */
+for (i = 0; i < D3DPRESENT_BACK_BUFFERS_MAX_EX + 1; i++) {
+if (!This->present_handles_pending_release[i]) {
+This->present_handles_pending_release[i] = present_handle;
+break;
+}
+}
+if (i == (D3DPRESENT_BACK_BUFFERS_MAX_EX + 1)) {
+ERR("Server not releasing buffers...\n");
+assert(false);
+}
+
+/* Destroy elements of the list released by the server */
+for (i = 0; i < D3DPRESENT_BACK_BUFFERS_MAX_EX + 1; i++) {
+if (This->present_handles_pending_release[i] &&
+ID3DPresent_IsBufferReleased(This->present, 
This->present_handles_pending_release[i])) {
+/* WaitBufferReleased also waits the presentation feedback
+ * (which should arrive at about the same time),
+ * while IsBufferReleased doesn't. DestroyD3DWindowBuffer 
unfortunately
+ * checks it to release immediately all data, else the release
+ * is postponed for This->present release. To avoid leaks (we may 
handle
+ * a lot of resize), call WaitBufferReleased. */
+ID3DPresent_WaitBufferReleased(This->present, 
This->present_handles_pending_release[i]);
+ID3DPresent_DestroyD3DWindowBuffer(This->present, 
This->present_handles_pending_release[i]);
+This->present_handles_pending_release[i] = NULL;
+}
+}
+}
+
 static int
 NineSwapChain9_GetBackBufferCountForParams( struct NineSwapChain9 *This,
 D3DPRESENT_PARAMETERS *pParams );
@@ -291,7 +325,7 @@ NineSwapChain9_Resize( struct NineSwapChain9 *This,
 This->enable_threadpool = FALSE;
 
 for (i = 0; i < oldBufferCount; i++) {
-ID3DPresent_DestroyD3DWindowBuffer(This->present, 
This->present_handles[i]);
+D3DWindowBuffer_release(This, This->present_handles[i]);
 This->present_handles[i] = NULL;
 if (This->present_buffers[i])
 pipe_resource_reference(&(This->present_buffers[i]), NULL);
@@ -519,6 +553,11 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
 FREE(This->pending_presentation[i]);
 }
 
+for (i = 0; i < D3DPRESENT_BACK_BUFFERS_MAX_EX + 1; i++) {
+if (This->present_handles_pending_release[i])
+ID3DPresent_DestroyD3DWindowBuffer(This->present, 
This->present_handles_pending_release[i]);
+}
+
 for (i = 0; i < This->num_back_buffers; i++) {
 if (This->buffers[i])
 NineUnknown_Detach(NineUnknown(This->buffers[i]));
@@ -738,13 +777,7 @@ present( struct NineSwapChain9 *This,
 create_present_buffer(This, target_width, target_height, 
_resource, _handle);
 /* Switch to the new buffer */
 if (new_handle) {
-/* WaitBufferReleased also waits the presentation feedback,
- * while IsBufferReleased doesn't. DestroyD3DWindowBuffer 
unfortunately
- * checks it to release immediately all data, else the release
- * is postponed for This->present release. To avoid leaks (we 
may handle
- * a lot of resize), call WaitBufferReleased. */
-ID3DPresent_WaitBufferReleased(This->present, 
This->present_handles[0]);
-ID3DPresent_DestroyD3DWindowBuffer(This->present, 
This->present_handles[0]);
+D3DWindowBuffer_release(This, This->present_handles[0]);
 This->present_handles[0] = new_handle;
 pipe_resource_reference(>present_buffers[0], 
new_resource);
 pipe_resource_reference(_resource, NULL);
diff --git a/src/gallium/state_trackers/nine/swapchain9.h 
b/src/gallium/state_trackers/nine/swapchain9.h
index 0fa0589d3b7..a6146445bdd 100644
--- a/src/gallium/state_trackers/nine/swapchain9.h
+++ b/src/gallium/state_trackers/nine/swapchain9.h
@@ -57,6 +57,7 @@ struct NineSwapChain9

Re: [Mesa-dev] [PATCH 6/6] radeonsi: always unmap texture CPU mappings on 32-bit CPU architectures

2018-12-14 Thread Axel Davy

Hi Marek,

That seems a good idea.
Several 32bits games have virtual address space issues as well with both 
Nine and Wine (but Nine seems a bit more affected because more libs are 
loaded).


Maybe the patch could go a little bit further by doing the same for 
buffers the first time they si_buffer_transfer_unmap is called for them ?


I've seen several nine games use a few big buffers (in DEFAULT pool) to 
store quite a lot of vertices,

in addition to the buffers frequently written to for rendering.

Unmapping those big buffers, that we are never going to write to again, 
could save a bit of that precious virtual space.


The safest test I believe, is to look only at buffers that are written 
to only once, thus always unmapping on first unmap.
For the buffers often written to, the cost of the first unmap will be 
negligible.


What do you think ?

Axel

On 14/12/2018 22:24, Marek Olšák wrote:

From: Marek Olšák 

Team Fortress 2 32-bit version runs out of the CPU address space.
---
  src/gallium/drivers/radeonsi/si_texture.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_texture.c 
b/src/gallium/drivers/radeonsi/si_texture.c
index 95f1e8c9693..39869445b0f 100644
--- a/src/gallium/drivers/radeonsi/si_texture.c
+++ b/src/gallium/drivers/radeonsi/si_texture.c
@@ -1791,20 +1791,26 @@ static void *si_texture_transfer_map(struct 
pipe_context *ctx,
  
  		buf = trans->staging;

} else {
/* the resource is mapped directly */
offset = si_texture_get_offset(sctx->screen, tex, level, box,
 >b.b.stride,
 >b.b.layer_stride);
buf = >buffer;
}
  
+	/* Always unmap texture CPU mappings on 32-bit architectures, so that

+* we don't run out of the CPU address space.
+*/
+   if (sizeof(void*) == 4)
+   usage |= RADEON_TRANSFER_TEMPORARY;
+
if (!(map = si_buffer_map_sync_with_rings(sctx, buf, usage)))
goto fail_trans;
  
  	*ptransfer = >b.b;

return map + offset;
  
  fail_trans:

r600_resource_reference(>staging, NULL);
pipe_resource_reference(>b.b.resource, NULL);
FREE(trans);
@@ -1812,20 +1818,30 @@ fail_trans:
  }
  
  static void si_texture_transfer_unmap(struct pipe_context *ctx,

  struct pipe_transfer* transfer)
  {
struct si_context *sctx = (struct si_context*)ctx;
struct si_transfer *stransfer = (struct si_transfer*)transfer;
struct pipe_resource *texture = transfer->resource;
struct si_texture *tex = (struct si_texture*)texture;
  
+	/* Always unmap texture CPU mappings on 32-bit architectures, so that

+* we don't run out of the CPU address space.
+*/
+   if (sizeof(void*) == 4) {
+   struct r600_resource *buf =
+   stransfer->staging ? stransfer->staging : >buffer;
+
+   sctx->ws->buffer_unmap(buf->buf);
+   }
+
if ((transfer->usage & PIPE_TRANSFER_WRITE) && stransfer->staging) {
if (tex->is_depth && tex->buffer.b.b.nr_samples <= 1) {
ctx->resource_copy_region(ctx, texture, transfer->level,
  transfer->box.x, 
transfer->box.y, transfer->box.z,
  >staging->b.b, 
transfer->level,
  >box);
} else {
si_copy_from_staging_texture(ctx, stransfer);
}
}



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] docs: Document GitLab merge request process (email alternative)

2018-12-13 Thread Axel Davy

On 13/12/2018 17:57, Mathias Fröhlich wrote:

Hi,
Initially it seemed to me that I am about the only one sticking with mailing 
lists.
And I personally feel like a too small contributor to really try to influence 
your
decisions too much. But these recent hand full of mails all tell me that I am 
not
that alone. I personally did contribute to several projects during the past 
years.
All that only in part time, thus it had to be *very* efficient for myself. And 
that is
something that I achieved by a consistent 'interface' to all those projects. 
Just
my widely used and highly convenient mail client. So, all that worked in a 
sufficiently
efficient way because I could combine this kind of 'work' even with my private 
mail
that I could handle in between with that single 'interface'. So going to any 
web site
there is already a detour and having multiple of them for each such project 
gives an
even longer detour. Okay today it's mostly mesa that is left as well as a 
communication
middle end used in vizsim applications. But going away too far away from a 
mailing list
will be mostly a loss of efficiency for me.
As I said, my two cents, that should not keep you all from doing changes that 
finally
increase your all efficiency ...

best

Mathias





Hi,


I have to add my voice here as well.

Even though I do not feel able to give review for most of the mesa code 
base,

I appreciate to have all patches in the mailing list in my mail client.

From time to time, I give feedback for some set of patches, for example 
when I see patches related to dri3 or that could impact Nine.


It also enables me to get an overview of all the recent works and new 
features Nine could use.


I feel like if most patches go through MR without getting a mail 
feedback, I would not be able to do those as efficiently.


I would appreciate if I could *flag* some files or directories, and when 
a MR impacts those (for example dri3 files, gallium interface, gallium 
Nine, etc),
I could get an automated mail with a summary of the MR, in order to 
encourage me to look at it.



Yours,

Axel


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Let's talk about -DDEBUG

2018-12-13 Thread Axel Davy

On 13/12/2018 17:26, Jason Ekstrand wrote:
On Thu, Dec 13, 2018 at 5:06 AM Eric Engestrom 
mailto:eric.engest...@intel.com>> wrote:


On Wednesday, 2018-12-12 15:24:25 -0800, Dylan Baker wrote:
> In the autotools discussion I've come to realize that we also
need to talk about
> the -DDEBUG guard. It seems that there are two different uses,
and thus two
> different asks about it:
>
> - Nine (and RadeonSI?) use -DDEBUG to hide generic debugging
> - NIR and Intel (at least) use -DDEBUG to hide really expensive
checks that are
>   useful, but necessarily tank performance.
>
> The first group would like -DDEBUG in debugoptimized builds, the
second
> obviously doesn't.
>
> Is the right solution to move the first group being !NDEBUG, or
would it be
> better to split DEBUG into two different defines such as
DEBUG_MESSAGES and
> EXPENSIVE_VALIDATION (paint the bikeshed whatever color you
like), with the
> first for both debug and debugoptimized and the second only in
debug builds?

Replacing DEBUG with !NDEBUG is obviously trivially simpler, but I
think
the right thing would be to split it into !NDEBUG and
EXPENSIVE_VALIDATION
(the color suits me just fine :P), as both solutions satisfy the first
group but only the latter solution satisfies the 2nd group.

I think a first pass might be to simply
s/DEBUG/EXPENSIVE_VALIDATION/ so
that it expresses the intent more clearly, with a prior patch to
convert
Nine and other obvious !NDEBUG candidates, then, later on, some of the
EXPENSIVE_VALIDATION can be promoted to !NDEBUG on a case-by-case
basis.


I think this whole discussion is way over-thinking this. With autools, 
we had two options: --enable-debug or not which, as I understand it, 
corresponds to release and debug.  Great.  Now meson adds a new one.  
Let's just pick something that makes sense and call it a day; it 
really doesn't matter.  Anyone who wants more control can just set 
their own CFLAGS.  Regardless of what we do, we're not really loosing 
anything by doing this as people who build Nine today with 
--enable-debug are getting an unoptimized build the same as they would 
with -Dbuild-type=debug. Users/devs can also always set -Db_ndebug 
manually to get the behavior that they want.


I don't know that I have all that strong of a preference as I'm not 
likely to use it anyway.  On the one hand, the name implies that it's 
a debug build only optimized.  This is different than CMake's 
RelWithDebugInfo which is clearly a release build with debug symbols.  
Based on that naming, i'd say we should leave asserts on.


I think the root of the issue is that different developers have tied 
way too much stuff to -DDEBUG.  The Nine people can add a 
-Dnine-logging=true flag that can turn on logging even in release 
builds.  In the NIR-based drivers, we already have environment 
variables to shut off NIR validation to make things go faster even in 
debug builds.


--Jason


Hi,


I agree with Jason that there seems to be a multitude of needs and that 
it may be hard to handle for all these needs in a simple way.


Devs who want to stress specific parts of their code can indeed use 
CFLAGS, and thus there isn't need to have a meson build mode for every 
specific need.


However I believe using a debug build option should be all that is 
needed for a user to help report bugs. If the user is investigating a 
crash, he wants to enable asserts and debug info. He may want to enable 
nine logging, etc.
Dev flags may change between releases, while the user would always have 
the same debug option to enable all it may need.


I think the autotools way was simple for the user, and the new meson way 
should be as simple. 'debugoptimized' is counter-intuitive for an user, 
who may expect all the mentioned debug info.


To me debugoptimized should be similar to debug, but with -O2.

Other dev specific debug options can be added with CFLAGS.



Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] last call for autotools

2018-12-12 Thread Axel Davy

On 12/12/2018 23:06, Dylan Baker wrote:

Quoting Marek Olšák (2018-12-12 13:07:10)

On Wed, Dec 12, 2018 at 3:52 PM Rob Clark  wrote:

 On Wed, Dec 12, 2018 at 3:45 PM Marek Olšák  wrote:
 >
 > On Wed, Dec 12, 2018 at 3:37 PM Rob Clark  wrote:
 >>
 >> On Wed, Dec 12, 2018 at 3:13 PM Bas Nieuwenhuizen
 >>  wrote:
 >> >
 >> > On Wed, Dec 12, 2018 at 8:59 PM Marek Olšák  wrote:
 >> > >
 >> > > There are 2 issues with meson:
 >> > > * -DDEBUG is not present in debugoptimized builds.
 >> >
 >> > Do people expect -DDEBUG for debugoptimized? I would think that debug
 >> > optimized would be an optimized build with debug symbols, but not
 >> > expensive checks & asserts, which would match the current
 >> > debugoptimized build?
 >>
 >> please, no -DDEBUG for debugoptimized.. I use that when I want debug
 >> syms but not (for example) nir_validate and other expensive checks.
 >
 >
 > If nir_validate is so bad, perhaps it shouldn't be run at all. If you
 work on NIR and it's not important for you to run nir_validate, perhaps it
 shouldn't be run at all. It doesn't have anything to do with build systems.
 >

 I do actually want it enabled when I piglit/deqp..  for which I use
 debug builds.  But I don't want it if I'm profiling or valgrinding,
 where I use debugoptimized..


At some point, DEBUG will be replaced by !NDEBUG.

Marek

The whole point of DEBUG is to hide really expensive checks like nir validate.
If there are are asserts hidden behind debug that's wrong, those should be
behind !NDEBUG, that's what NDEBUG is for. We've talked about renaming DEBUG
because it's confusing, something like EXPENSIVE_DEBUG or EXPENSIVE_VALIDATION
or something along those lines.

The thread I linked you to previously has extensive discussion of why DEBUG was
removed from the debugoptimized build.

For CFLAGS: yes, environment variables are only read during the initial
configuration by design, they basically exist for compatibility with autotools.
You want to use -Dc_args and -Dcpp_args (for CFLAGS and CXXFLAGS respectively)
these are read by meson, meson configure, and meson reconfigure.

Dylan


Hi,

Currently nine debug log (enabled with NINE_DEBUG=all) only works with 
--enable-debug build.


There is quite a performance difference with autotools when you build 
with --enable-debug. However we often need users to use NINE_DEBUG to 
produce logs for issues.


These sorts of things should be made to work with debugoptimized if 
possible.


Currently we hide that against the DEBUG env var. Should that be 
replaced by !NDEBUG when autotools support is killed ?


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] d3dadapter9: use snprintf(..., "%s", ...) instead of strncpy

2018-12-01 Thread Axel Davy

Well, OK, I guess these snprintf call are an acceptable solution then.

Reviewed-by: Axel Davy 

On 26/11/2018 13:23, Andre Heider wrote:

On 25/11/2018 17:23, Axel Davy wrote:
Reading 
https://developers.redhat.com/blog/2018/05/24/detecting-string-truncation-with-gcc-8/ 

I think the snprintf variant suffers from the same issue, and the 
compiler is just not yet able to detect it,

and send the same warning (but it might do in later gcc versions).


In this case we care about the terminating NULL (which strncpy() does 
not ensure) and not really about the truncation, because all these 
chunks are about D3DADAPTER_IDENTIFIER9.Description with a fixed size 
of 512 chars.


Even if those spots would get truncated, you won't get a warning about 
it, see [0] ;)


Probably a better fix would be to copy with a max size of 
sizeof(drvid->Description)-1 and do

drvid->Description[sizeof(drvid->Description)-1] = '\0';
(Though the webpage says only doing the assignment should be 
sufficient to please gcc).


Sure, but then mesa might as well import a helper like strlcpy() [1] 
instead of doing that locally.


For these few uncritical spots I think snprintf() is just fine, even 
if a format specifier of just "%s" looks weird. But that's your call.


Thanks,
Andre

[0] 97ae5a85 "meson+autotools: get rid of spammy GCC warning 
-Wformat-truncation"

[1] https://cgit.freedesktop.org/libbsd/tree/src/strlcpy.c



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] d3dadapter9: use snprintf(..., "%s", ...) instead of strncpy

2018-11-25 Thread Axel Davy
Reading 
https://developers.redhat.com/blog/2018/05/24/detecting-string-truncation-with-gcc-8/
I think the snprintf variant suffers from the same issue, and the 
compiler is just not yet able to detect it,

and send the same warning (but it might do in later gcc versions).

Probably a better fix would be to copy with a max size of 
sizeof(drvid->Description)-1 and do

drvid->Description[sizeof(drvid->Description)-1] = '\0';
(Though the webpage says only doing the assignment should be sufficient 
to please gcc).


Axel

On 25/11/2018 10:49, Andre Heider wrote:

Fixes -Wstringop-truncation compiler warnings.
See f836d799f9066adf58f36 "intel/decoder: use snprintf(..., "%s", ...) instead of 
strncpy"

Signed-off-by: Andre Heider 
---
  src/gallium/targets/d3dadapter9/description.c | 27 ---
  src/gallium/targets/d3dadapter9/drm.c |  8 +++---
  2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/src/gallium/targets/d3dadapter9/description.c 
b/src/gallium/targets/d3dadapter9/description.c
index c0a86782f8..a3e4cd6177 100644
--- a/src/gallium/targets/d3dadapter9/description.c
+++ b/src/gallium/targets/d3dadapter9/description.c
@@ -20,6 +20,7 @@
   * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
   * USE OR OTHER DEALINGS IN THE SOFTWARE. */
  
+#include 

  #include 
  #include "adapter9.h"
  
@@ -239,7 +240,7 @@ d3d_match_vendor_id( D3DADAPTER_IDENTIFIER9* drvid,

  DBG("unknown vendor 0x4%x, emulating 0x4%x\n", drvid->VendorId, 
fallback_ven);
  drvid->VendorId = fallback_ven;
  drvid->DeviceId = fallback_dev;
-strncpy(drvid->Description, fallback_name, sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description), "%s", 
fallback_name);
  }
  
  /* fill in driver name and version */

@@ -277,46 +278,54 @@ void d3d_fill_cardname(D3DADAPTER_IDENTIFIER9* drvid) {
  case HW_VENDOR_INTEL:
  for (i = 0; i < sizeof(cards_intel) / sizeof(cards_intel[0]); i++) {
  if (strstr(drvid->Description, cards_intel[i].mesaname)) {
-strncpy(drvid->Description, cards_intel[i].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_intel[i].d3d9name);
  return;
  }
  }
  /* use a fall-back if nothing matches */
  DBG("Unknown card name %s!\n", drvid->DeviceName);
-strncpy(drvid->Description, cards_intel[0].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_intel[0].d3d9name);
  break;
  case HW_VENDOR_VMWARE:
  for (i = 0; i < sizeof(cards_vmware) / sizeof(cards_vmware[0]); i++) {
  if (strstr(drvid->Description, cards_vmware[i].mesaname)) {
-strncpy(drvid->Description, cards_vmware[i].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_vmware[i].d3d9name);
  return;
  }
  }
  /* use a fall-back if nothing matches */
  DBG("Unknown card name %s!\n", drvid->DeviceName);
-strncpy(drvid->Description, cards_vmware[0].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_vmware[0].d3d9name);
  break;
  case HW_VENDOR_AMD:
  for (i = 0; i < sizeof(cards_amd) / sizeof(cards_amd[0]); i++) {
  if (strstr(drvid->Description, cards_amd[i].mesaname)) {
-strncpy(drvid->Description, cards_amd[i].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_amd[i].d3d9name);
  return;
  }
  }
  /* use a fall-back if nothing matches */
  DBG("Unknown card name %s!\n", drvid->DeviceName);
-strncpy(drvid->Description, cards_amd[0].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_amd[0].d3d9name);
  break;
  case HW_VENDOR_NVIDIA:
  for (i = 0; i < sizeof(cards_nvidia) / sizeof(cards_nvidia[0]); i++) {
  if (strstr(drvid->Description, cards_nvidia[i].mesaname)) {
-strncpy(drvid->Description, cards_nvidia[i].d3d9name, 
sizeof(drvid->Description));
+snprintf(drvid->Description, sizeof(drvid->Description),
+ "%s", cards_nvidia[i].d3d9name);
  return;
  }
  }
  /* use a fall-back if nothing matches */
  DBG("Unknown card name %s!\n", drvid->DeviceName);
-strncpy(drvid->Description, cards_nvidia[0].d3d9name, 
sizeof(drvid->Description));
+

Re: [Mesa-dev] [PATCH] st/mesa: disable L3 thread pinning

2018-11-12 Thread Axel Davy

Is there any replacement plan with a new feature ?

Axel

On 12/11/2018 21:45, Marek Olšák wrote:

From: Marek Olšák 

This implementation can have massive drawbacks.

Cc: 18.3 
---
  src/mesa/state_tracker/st_manager.c | 9 -
  1 file changed, 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_manager.c 
b/src/mesa/state_tracker/st_manager.c
index 690d5bc2313..076ad42646d 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -1065,29 +1065,20 @@ st_api_make_current(struct st_api *stapi, struct 
st_context_iface *stctxi,
   ret = _mesa_make_current(st->ctx, incomplete, incomplete);
}
  
st_framebuffer_reference(, NULL);

st_framebuffer_reference(, NULL);
  
/* Purge the context's winsys_buffers list in case any

 * of the referenced drawables no longer exist.
 */
st_framebuffers_purge(st);
-
-  /* Notify the driver that the context thread may have been changed.
-   * This should pin all driver threads to a specific L3 cache for optimal
-   * performance on AMD Zen CPUs.
-   */
-  struct glthread_state *glthread = st->ctx->GLThread;
-  thrd_t *upper_thread = glthread ? >queue.threads[0] : NULL;
-
-  util_context_thread_changed(st->pipe, upper_thread);
 }
 else {
ret = _mesa_make_current(NULL, NULL, NULL);
 }
  
 return ret;

  }
  
  
  static void



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/2] Update on thread_submit

2018-11-10 Thread Axel Davy
I just wanted to write a cover letter for this patch to describe
a feature not many user may know about.

On d3d9, frames are supposed to be displayed in the order
they were produced, with no frame skipped.

Thus when vsync is off, the newly produced frames replace the last frame during
refresh, thus creating screen tearing (unless your compositor composites
fullscreen apps, but that's another story).

Mesa ogl backend currently should tear as well when vsync is off.


While this is not in the spec, nine allows to enable a feature
often named 'triple buffering'.
It is named that way because intuitively the idea is to have one buffer
to render to, one buffer on screen, and one buffer more recent than the buffer
on screen that is ready to go on screen at next refresh.

When the refresh is done, we replace the buffer on screen with the more
up-to-date buffer.
When a new frame is rendered, the buffer that is not on screen,
but could go on screen are swapped.

(Ofc you have don't switch the buffer on screen if you don't have any
newer frame).

This enables to render at any framerate (even above screen refresh)
without tearing.

This feature on nine is enabled with the env var (also drirc conf option)
tearfree_discard=true.


One issue though is that currently buffers are sent to the server while
rendering is not finished.

This behaviour, done by mesa ogl as well (both X and Wayland),
is probably done because:
. This helps reducing compositor lag.
  The compositing operation is scheduled ahead in the gpu pipeline.
. As soon as the application updates its window, it can fetch the content via
  some API, or rely on some window message communication to occur.
  You need thus to send the updated buffer when ogl swapbuffer() is called.

When fullscreen and not composited, however, we run into the following issue:
It is possible the buffer gets planned for a pageflip (replace current buffer
on screen), but it is not finished rendering when the pageflip occurs.
Thus the pageflip fails and the previous buffer stays on screen.

This can make a game fps feel smaller on screen that what it is rendering at,
and adds lag.


Fortunately for nine, we can reasonnably assume the application won't read the
window content just after presenting it with d3d (in game screenshots are
implemented by looking at the rendering buffer, not the window content,
and beside that, which app would want to access that content ?).

In order to support DRI_PRIME systems without artefacts, before the kernel
supported dma-buf synchronization, nine added support for thread_submit drirc
env var, which basically uses a thread to delay sending the last rendered
buffer to the X server until it is finished rendering.
This feature can be used without DRI_PRIME as well.

This patchset enables to use thread_submit=true with tearfree_discard=true,
thus enabling to have 'triple buffering' without the mentionned issue.


Another solution of course is to use vsync, but some games' dynamics
can work better without (because cpu time gets eaten waiting vsync).


With vsync, some users apparently have issues with
pageflip getting missed, and in that case thread_submit=true can be used
to increase smoothness.




We thus recommand the following configurations (assuming fullscreen and
not composited):

Game with vsync:
Use thread_submit=true if your graphic card rendering rate is close or
slightly above the screen refresh rate (you are more likely to have
missed pageflips).
You can also enable always, it shouldn't hurt

Game without vsync:
If you don't mind tearing, nothing particular.
If you mind tearing, thread_submit=true tearfree_discard=true


tearfree_discard=true doesn't impact vsync, and both env vars
don't affect mesa ogl, thus you can just set
export tearfree_discard=true
export thread_submit=true
 in your
~/.bashrc to have them always on.


Note: if you don't see tearing with vsync off and without
these options, it means you get composited.
Getting composited means a small perf impact
and possibly a small lag.
Either disable composition manually (alt+shift+f12 on kwin
for example) when needed, or use a wine patch like this one:
https://github.com/ValveSoftware/wine/commit/141ba5cf73029029a5a0bd2cdcfd5f9f9ab7ee7b


Axel Davy (2):
  st/nine: Allow 'triple buffering' with thread_submit
  st/nine: Remove thread_submit warning

 src/gallium/state_trackers/nine/swapchain9.c | 66 +++-
 src/gallium/state_trackers/nine/swapchain9.h |  1 +
 src/gallium/targets/d3dadapter9/drm.c|  3 -
 3 files changed, 50 insertions(+), 20 deletions(-)

-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/nine: Remove thread_submit warning

2018-11-10 Thread Axel Davy
thread_submit can be useful even without DRI_PRIME,
as it can help avoid missed pageflips.

Signed-off-by: Axel Davy 
---
 src/gallium/targets/d3dadapter9/drm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/gallium/targets/d3dadapter9/drm.c 
b/src/gallium/targets/d3dadapter9/drm.c
index 85b3e10633e..6fb8caf5c2f 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -279,9 +279,6 @@ drm_create_adapter( int fd,
 DBG("You have set a non standard throttling value in combination with 
thread_submit."
 "We advise to use a throttling value of -2/0");
 }
-if (ctx->base.thread_submit && !different_device)
-DBG("You have set thread_submit but do not use a different device than 
the server."
-"You should not expect any benefit.");
 
 if (driCheckOption(, "override_vendorid", DRI_INT)) {
 override_vendorid = driQueryOptioni(, 
"override_vendorid");
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] st/nine: Allow 'triple buffering' with thread_submit

2018-11-10 Thread Axel Davy
The path allowing triple buffering behaviour wasn't implemented
yet for thread_submit

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/swapchain9.c | 66 +++-
 src/gallium/state_trackers/nine/swapchain9.h |  1 +
 2 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index 85ee51a0ae7..f86ab81ab97 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -28,6 +28,7 @@
 #include "nine_pipe.h"
 #include "nine_dump.h"
 
+#include "util/u_atomic.h"
 #include "util/u_inlines.h"
 #include "util/u_surface.h"
 #include "hud/hud_context.h"
@@ -50,6 +51,7 @@ NineSwapChain9_ctor( struct NineSwapChain9 *This,
  D3DDISPLAYMODEEX *mode )
 {
 HRESULT hr;
+int i;
 
 DBG("This=%p pDevice=%p pPresent=%p pCTX=%p hFocusWindow=%p\n",
 This, pParams->device, pPresent, pCTX, hFocusWindow);
@@ -65,8 +67,7 @@ NineSwapChain9_ctor( struct NineSwapChain9 *This,
 This->mode = NULL;
 
 ID3DPresent_AddRef(pPresent);
-if (!This->actx->thread_submit &&
-This->base.device->minor_version_num > 2) {
+if (This->base.device->minor_version_num > 2) {
 D3DPRESENT_PARAMETERS2 params2;
 
 memset(, 0, sizeof(D3DPRESENT_PARAMETERS2));
@@ -80,6 +81,11 @@ NineSwapChain9_ctor( struct NineSwapChain9 *This,
 
 This->rendering_done = FALSE;
 This->pool = NULL;
+for (i = 0; i < D3DPRESENT_BACK_BUFFERS_MAX_EX + 1; i++) {
+This->pending_presentation[i] = calloc(1, sizeof(BOOL));
+if (!This->pending_presentation[i])
+return E_OUTOFMEMORY;
+}
 return NineSwapChain9_Resize(This, pPresentationParameters, mode);
 }
 
@@ -508,6 +514,11 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
 if (This->pool)
 _mesa_threadpool_destroy(This, This->pool);
 
+for (i = 0; i < D3DPRESENT_BACK_BUFFERS_MAX_EX + 1; i++) {
+if (This->pending_presentation[i])
+FREE(This->pending_presentation[i]);
+}
+
 for (i = 0; i < This->num_back_buffers; i++) {
 if (This->buffers[i])
 NineUnknown_Detach(NineUnknown(This->buffers[i]));
@@ -619,6 +630,7 @@ struct end_present_struct {
 struct pipe_fence_handle *fence_to_wait;
 ID3DPresent *present;
 D3DWindowBuffer *present_handle;
+BOOL *pending_presentation;
 HWND hDestWindowOverride;
 };
 
@@ -630,6 +642,7 @@ static void work_present(void *data)
 work->screen->fence_reference(work->screen, &(work->fence_to_wait), 
NULL);
 }
 ID3DPresent_PresentBuffer(work->present, work->present_handle, 
work->hDestWindowOverride, NULL, NULL, NULL, 0);
+p_atomic_set(work->pending_presentation, FALSE);
 free(work);
 }
 
@@ -643,6 +656,8 @@ static void pend_present(struct NineSwapChain9 *This,
 work->present = This->present;
 work->present_handle = This->present_handles[0];
 work->hDestWindowOverride = hDestWindowOverride;
+work->pending_presentation = This->pending_presentation[0];
+p_atomic_set(work->pending_presentation, TRUE);
 This->tasks[0] = _mesa_threadpool_queue_task(This->pool, work_present, 
work);
 
 return;
@@ -853,6 +868,7 @@ NineSwapChain9_Present( struct NineSwapChain9 *This,
 struct pipe_resource *res = NULL;
 D3DWindowBuffer *handle_temp;
 struct threadpool_task *task_temp;
+BOOL *pending_presentation_temp;
 int i;
 HRESULT hr;
 
@@ -886,14 +902,14 @@ NineSwapChain9_Present( struct NineSwapChain9 *This,
 
 if (This->base.device->minor_version_num > 2 &&
 This->params.SwapEffect == D3DSWAPEFFECT_DISCARD &&
-This->params.PresentationInterval == D3DPRESENT_INTERVAL_IMMEDIATE &&
-!This->actx->thread_submit) {
+This->params.PresentationInterval == D3DPRESENT_INTERVAL_IMMEDIATE) {
 int next_buffer = -1;
 
 while (next_buffer == -1) {
 /* Find a free backbuffer */
 for (i = 1; i < This->num_back_buffers; i++) {
-if (ID3DPresent_IsBufferReleased(This->present, 
This->present_handles[i])) {
+if (!p_atomic_read(This->pending_presentation[i]) &&
+ID3DPresent_IsBufferReleased(This->present, 
This->present_handles[i])) {
 DBG("Found buffer released: %d\n", i);
 next_buffer = i;
 break;
@@ -904,6 +920,17 @@ NineSwapChain9_Present( struct NineSwapChain9 *This,
 ID3DPresent_WaitBufferReleaseEvent(This->present);
 }
 }
+
+/* Free the task (we already checked it is finis

Re: [Mesa-dev] [PATCH v2 2/5] gallium: Add new PIPE_CAP_MULTISAMPLED_RENDER_TO_TEXTURE

2018-11-06 Thread Axel Davy

Hi,

Is there anything to be done in the nine state trackers (or other state 
trackers).


Nine uses create_surface. Should it expect the field to be filled 
properly by the driver ?


On 06/11/2018 23:09, Kristian H. Kristensen wrote:

+   /**
+* If a driver doesn't advertise PIPE_CAP_MULTISAMPLED_RENDER_TO_TEXTURE,
+* pipe_surface::nr_samples will always be 0.
+*/

The above comment should be added to the comment below.

+   /** Number of samples for the surface.  This can be different from the
+* resource nr_samples when the resource is bound using
+* FramebufferTexture2DMultisampleEXT.
+*/
+   unsigned nr_samples:8;
+
 union pipe_surface_desc u;
  };
  



Yours,


Axel Davy

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] st/nine: fix stack corruption due to ABI mismatch

2018-11-06 Thread Axel Davy

Hi,

The three patches seem ok.
Thanks,

Reviewed-by: Axel Davy 



I assume you don't have push rights. I will push in a few days if nobody 
complains.


Yours,

Axel Davy


On 06/11/2018 09:27, Andre Heider wrote:

This fixes various crashes and hangs when using nine's 'thread_submit'
feature.

On 64bit, the thread function's data argument would just be NULL.
On 32bit, the data argument would be garbage depending on the compiler
flags (in my case -march>=core2).

Fixes: f3fa7e3068512d ("st/nine: Use WINE thread for threadpool")
Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Andre Heider 
---
  src/gallium/state_trackers/nine/threadpool.c | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/threadpool.c 
b/src/gallium/state_trackers/nine/threadpool.c
index cc62fd2579..19721aab2d 100644
--- a/src/gallium/state_trackers/nine/threadpool.c
+++ b/src/gallium/state_trackers/nine/threadpool.c
@@ -37,6 +37,7 @@
  #include "os/os_thread.h"
  #include "threadpool.h"
  
+/* POSIX thread function */

  static void *
  threadpool_worker(void *data)
  {
@@ -76,6 +77,15 @@ threadpool_worker(void *data)
  return NULL;
  }
  
+/* Windows thread function */

+static DWORD NINE_WINAPI
+wthreadpool_worker(void *data)
+{
+threadpool_worker(data);
+
+return 0;
+}
+
  struct threadpool *
  _mesa_threadpool_create(struct NineSwapChain9 *swapchain)
  {
@@ -87,7 +97,9 @@ _mesa_threadpool_create(struct NineSwapChain9 *swapchain)
  pthread_mutex_init(>m, NULL);
  pthread_cond_init(>new_work, NULL);
  
-pool->wthread = NineSwapChain9_CreateThread(swapchain, threadpool_worker, pool);

+/* This uses WINE's CreateThread, so the thread function needs to use
+ * the Windows ABI */
+pool->wthread = NineSwapChain9_CreateThread(swapchain, wthreadpool_worker, 
pool);
  if (!pool->wthread) {
  /* using pthread as fallback */
  pthread_create(>pthread, NULL, threadpool_worker, pool);



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] st/nine: Reduce MaxSimultaneousTextures to 8

2018-10-24 Thread Axel Davy
Windows drivers don't set this flag (which affects ff) to more than 8.

Do the same in case some games check for 8.

v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is able of vs/ps3.
Add this requirement of 16 textures to the driver requirements.

Signed-off-by: Axel Davy 
---
Thanks to our tester iive who spotted the issue.
 src/gallium/state_trackers/nine/adapter9.c | 9 -
 src/gallium/state_trackers/nine/device9.c  | 8 
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/nine/adapter9.c 
b/src/gallium/state_trackers/nine/adapter9.c
index 2fa92e4207b..0634d5918ce 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -77,7 +77,9 @@ NineAdapter9_ctor( struct NineAdapter9 *This,
 hal->get_shader_param(hal, PIPE_SHADER_VERTEX,
   PIPE_SHADER_CAP_MAX_INPUTS) < 16 ||
 hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
-  PIPE_SHADER_CAP_MAX_INPUTS) < 10) {
+  PIPE_SHADER_CAP_MAX_INPUTS) < 10 ||
+hal->get_shader_param(hal, PIPE_SHADER_FRAGMENT,
+  PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS) < 16) {
 ERR("Your card is not supported by Gallium Nine. Minimum requirement "
 "is >= r500, >= nv50, >= i965\n");
 return D3DERR_DRIVERINTERNALERROR;
@@ -789,10 +791,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 
 pCaps->MaxTextureBlendStages = 8; /* XXX wine */
 (DWORD)screen->get_param(screen, PIPE_CAP_BLEND_EQUATION_SEPARATE);
-pCaps->MaxSimultaneousTextures = screen->get_shader_param(screen,
-PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS);
-if (pCaps->MaxSimultaneousTextures > NINE_MAX_SAMPLERS_PS)
-pCaps->MaxSimultaneousTextures = NINE_MAX_SAMPLERS_PS;
+pCaps->MaxSimultaneousTextures = 8;
 
 pCaps->VertexProcessingCaps = D3DVTXPCAPS_TEXGEN |
   D3DVTXPCAPS_TEXGEN_SPHEREMAP |
diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index ae8733027e8..24c8ce062b3 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2451,7 +2451,7 @@ NineDevice9_GetTexture( struct NineDevice9 *This,
 DWORD Stage,
 IDirect3DBaseTexture9 **ppTexture )
 {
-user_assert(Stage < This->caps.MaxSimultaneousTextures ||
+user_assert(Stage < NINE_MAX_SAMPLERS_PS ||
 Stage == D3DDMAPSAMPLER ||
 (Stage >= D3DVERTEXTEXTURESAMPLER0 &&
  Stage <= D3DVERTEXTEXTURESAMPLER3), D3DERR_INVALIDCALL);
@@ -2478,7 +2478,7 @@ NineDevice9_SetTexture( struct NineDevice9 *This,
 
 DBG("This=%p Stage=%u pTexture=%p\n", This, Stage, pTexture);
 
-user_assert(Stage < This->caps.MaxSimultaneousTextures ||
+user_assert(Stage < NINE_MAX_SAMPLERS_PS ||
 Stage == D3DDMAPSAMPLER ||
 (Stage >= D3DVERTEXTEXTURESAMPLER0 &&
  Stage <= D3DVERTEXTEXTURESAMPLER3), D3DERR_INVALIDCALL);
@@ -2552,7 +2552,7 @@ NineDevice9_GetSamplerState( struct NineDevice9 *This,
  D3DSAMPLERSTATETYPE Type,
  DWORD *pValue )
 {
-user_assert(Sampler < This->caps.MaxSimultaneousTextures ||
+user_assert(Sampler < NINE_MAX_SAMPLERS_PS ||
 Sampler == D3DDMAPSAMPLER ||
 (Sampler >= D3DVERTEXTEXTURESAMPLER0 &&
  Sampler <= D3DVERTEXTEXTURESAMPLER3), D3DERR_INVALIDCALL);
@@ -2575,7 +2575,7 @@ NineDevice9_SetSamplerState( struct NineDevice9 *This,
 DBG("This=%p Sampler=%u Type=%s Value=%08x\n", This,
 Sampler, nine_D3DSAMP_to_str(Type), Value);
 
-user_assert(Sampler < This->caps.MaxSimultaneousTextures ||
+user_assert(Sampler < NINE_MAX_SAMPLERS_PS ||
 Sampler == D3DDMAPSAMPLER ||
 (Sampler >= D3DVERTEXTEXTURESAMPLER0 &&
  Sampler <= D3DVERTEXTEXTURESAMPLER3), D3DERR_INVALIDCALL);
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/12] st/nine: Mark transform matrices dirty for D3DSBT_ALL

2018-10-24 Thread Axel Davy
D3DSBT_ALL stateblocks capture the transform matrices.

Fixes some d3d test programs not displaying properly.

Signed-off-by: Axel Davy 
---
Notice without the previous patch, D3DSBT_ALL stateblocks
would send hundreds of identity matrices to the context
every apply.
 src/gallium/state_trackers/nine/device9.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 3b174587a44..25a8172b3fd 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2376,13 +2376,24 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
   NINE_STATE_IDXBUF |
   NINE_STATE_FF_MATERIAL |
   NINE_STATE_BLEND_COLOR |
-  NINE_STATE_SAMPLE_MASK;
+  NINE_STATE_SAMPLE_MASK |
+  NINE_STATE_FF_VSTRANSF;
memset(dst->changed.rs, ~0, (D3DRS_COUNT / 32) * sizeof(uint32_t));
dst->changed.rs[D3DRS_LAST / 32] |= (1 << (D3DRS_COUNT % 32)) - 1;
dst->changed.vtxbuf = (1ULL << This->caps.MaxStreams) - 1;
dst->changed.stream_freq = dst->changed.vtxbuf;
dst->changed.ucp = (1 << PIPE_MAX_CLIP_PLANES) - 1;
dst->changed.texture = (1 << NINE_MAX_SAMPLERS) - 1;
+   /* The doc says the projection, world, view and texture matrices
+* are saved, which would translate to:
+* dst->ff.changed.transform[0] = 0x00FF000C;
+* dst->ff.changed.transform[D3DTS_WORLD / 32] |= 1 << (D3DTS_WORLD % 
32);
+* However we assume they meant save everything (which is basically 
just the
+* above plus the other world matrices).
+*/
+   dst->ff.changed.transform[0] = 0x00FF000C;
+   for (s = 0; s < 8; s++)
+   dst->ff.changed.transform[8+s] = ~0;
 }
 NineStateBlock9_Capture(NineStateBlock9(*ppSB));
 
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/12] st/nine: Handle window resize when a presentation buffer is used

2018-10-24 Thread Axel Davy
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases, it is not done,
and it expects the output resizes to the window size, even if
the back buffer size is unchanged.

This patch introduces the behaviour when a presentation buffer
is used.

ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.

Signed-off-by: Axel Davy 
---
A presentation buffer is used when multisampling is used
or when thread_submit=true is used (this is useful for prime).
I have another patch that switches to presentation buffer when this
resizing behaviour is needed, however it is not ready for merge.
Having the behaviour for this subset of cases is already better than
nothing.
 src/gallium/state_trackers/nine/swapchain9.c | 31 +++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c
index aa485a6268b..cd77081e915 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -662,6 +662,7 @@ present( struct NineSwapChain9 *This,
 struct pipe_fence_handle *fence;
 HRESULT hr;
 struct pipe_blit_info blit;
+int target_width, target_height, target_depth;
 
 DBG("present: This=%p pSourceRect=%p pDestRect=%p "
 "pDirtyRegion=%p hDestWindowOverride=%p"
@@ -696,6 +697,9 @@ present( struct NineSwapChain9 *This,
 if (This->params.SwapEffect == D3DSWAPEFFECT_DISCARD)
 handle_draw_cursor_and_hud(This, resource);
 
+ID3DPresent_GetWindowInfo(This->present, hDestWindowOverride, 
_width, _height, _depth);
+(void)target_depth;
+
 pipe = NineDevice9_GetPipe(This->base.device);
 
 if (This->present_buffers[0]) {
@@ -710,6 +714,29 @@ present( struct NineSwapChain9 *This,
 blit.src.box.width = resource->width0;
 blit.src.box.height = resource->height0;
 
+/* Reallocate a new presentation buffer if the target window
+ * size has changed */
+if (target_width != This->present_buffers[0]->width0 ||
+target_height != This->present_buffers[0]->height0) {
+struct pipe_resource *new_resource;
+D3DWindowBuffer *new_handle;
+
+create_present_buffer(This, target_width, target_height, 
_resource, _handle);
+/* Switch to the new buffer */
+if (new_handle) {
+/* WaitBufferReleased also waits the presentation feedback,
+ * while IsBufferReleased doesn't. DestroyD3DWindowBuffer 
unfortunately
+ * checks it to release immediately all data, else the release
+ * is postponed for This->present release. To avoid leaks (we 
may handle
+ * a lot of resize), call WaitBufferReleased. */
+ID3DPresent_WaitBufferReleased(This->present, 
This->present_handles[0]);
+ID3DPresent_DestroyD3DWindowBuffer(This->present, 
This->present_handles[0]);
+This->present_handles[0] = new_handle;
+pipe_resource_reference(>present_buffers[0], 
new_resource);
+pipe_resource_reference(_resource, NULL);
+}
+}
+
 resource = This->present_buffers[0];
 
 blit.dst.resource = resource;
@@ -723,7 +750,9 @@ present( struct NineSwapChain9 *This,
 blit.dst.box.height = resource->height0;
 
 blit.mask = PIPE_MASK_RGBA;
-blit.filter = PIPE_TEX_FILTER_NEAREST;
+blit.filter = (blit.dst.box.width == blit.src.box.width &&
+   blit.dst.box.height == blit.src.box.height) ?
+  PIPE_TEX_FILTER_NEAREST : PIPE_TEX_FILTER_LINEAR;
 blit.scissor_enable = FALSE;
 blit.alpha_blend = FALSE;
 
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/12] d3dadapter: Fix wrong naming in header file

2018-10-24 Thread Axel Davy
GetWindowInfo used to be GetWindowSize before gallium
nine was merged. A left-over remained...

Signed-off-by: Axel Davy 
---
 include/d3dadapter/present.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/d3dadapter/present.h b/include/d3dadapter/present.h
index 95e8d679e35..0325ebc511f 100644
--- a/include/d3dadapter/present.h
+++ b/include/d3dadapter/present.h
@@ -125,7 +125,7 @@ struct ID3DPresent
 #define ID3DPresent_SetCursorPos(p,a) (p)->lpVtbl->SetCursorPos(p,a)
 #define ID3DPresent_SetCursor(p,a,b,c) (p)->lpVtbl->SetCursor(p,a,b,c)
 #define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)
-#define ID3DPresent_GetWindowInfo(p,a,b,c,d) 
(p)->lpVtbl->GetWindowSize(p,a,b,c,d)
+#define ID3DPresent_GetWindowInfo(p,a,b,c,d) 
(p)->lpVtbl->GetWindowInfo(p,a,b,c,d)
 #define ID3DPresent_GetWindowOccluded(p) (p)->lpVtbl->GetWindowOccluded(p)
 #define ID3DPresent_ResolutionMismatch(p) (p)->lpVtbl->ResolutionMismatch(p)
 #define ID3DPresent_CreateThread(p,a,b) (p)->lpVtbl->CreateThread(p,a,b)
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/12] st/nine: Fix aliasing states for stateblocks

2018-10-24 Thread Axel Davy
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.

These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.

We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.

These two changes avoid setting the two mentionned states when we
shouldn't.

Fixes: https://github.com/iXit/Mesa-3D/issues/320

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 461b212995b..1a3f2c3285b 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2014,7 +2014,7 @@ NineDevice9_SetTransform( struct NineDevice9 *This,
 *M = *pMatrix;
 if (unlikely(This->is_recording)) {
 state->ff.changed.transform[State / 32] |= 1 << (State % 32);
-state->changed.group |= NINE_STATE_FF;
+state->changed.group |= NINE_STATE_FF_VSTRANSF;
 } else
 nine_context_set_transform(This, State, pMatrix);
 
@@ -2261,7 +2261,6 @@ NineDevice9_SetRenderState( struct NineDevice9 *This,
 state->rs_advertised[State] = Value;
 /* only need to record changed render states for stateblocks */
 state->changed.rs[State / 32] |= 1 << (State % 32);
-state->changed.group |= nine_render_state_group[State];
 return D3D_OK;
 }
 
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/12] st/nine: Do not set unused states for stateblocks

2018-10-24 Thread Axel Davy
A lot of these states are used only for the context,
and are unused for stateblocks (which just uses the
changed.* fields instead for a lot of them).

Signed-off-by: Axel Davy 
---
Before we implemented csmt, which separated the 'context' states and the
application visible states + the stateblocks, setting these states
was required to have the context states update properly.
Now they live in a different world.
A valid complaint for this patchset would be that it would be less
confusing to rename all NINE_STATE_* flags, such that the context
would use states of different names than stateblocks to reduce confusion.
I'm open for discussion, and may do this renaming in a future patch if
convinced.
 src/gallium/state_trackers/nine/device9.c | 24 +++
 1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 1a3f2c3285b..ae8733027e8 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2315,9 +2315,7 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
 *ppSB = (IDirect3DStateBlock9 *)nsb;
 dst = >state;
 
-dst->changed.group =
-   NINE_STATE_TEXTURE |
-   NINE_STATE_SAMPLER;
+dst->changed.group = NINE_STATE_SAMPLER;
 
 if (Type == D3DSBT_ALL || Type == D3DSBT_VERTEXSTATE) {
dst->changed.group |=
@@ -2350,10 +2348,7 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
 }
 if (Type == D3DSBT_ALL || Type == D3DSBT_PIXELSTATE) {
dst->changed.group |=
-  NINE_STATE_PS | NINE_STATE_PS_CONST | NINE_STATE_BLEND |
-  NINE_STATE_FF_VS_OTHER | NINE_STATE_FF_PS_CONSTS | 
NINE_STATE_PS_CONST |
-  NINE_STATE_FB | NINE_STATE_DSA | NINE_STATE_MULTISAMPLE |
-  NINE_STATE_RASTERIZER | NINE_STATE_STENCIL_REF;
+  NINE_STATE_PS | NINE_STATE_PS_CONST | NINE_STATE_FF_PS_CONSTS;
memcpy(dst->changed.rs,
   nine_render_states_pixel, sizeof(dst->changed.rs));
nine_ranges_insert(>changed.ps_const_f, 0, This->max_ps_const_f,
@@ -2371,13 +2366,8 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
dst->changed.group |=
   NINE_STATE_VIEWPORT |
   NINE_STATE_SCISSOR |
-  NINE_STATE_RASTERIZER |
-  NINE_STATE_BLEND |
-  NINE_STATE_DSA |
   NINE_STATE_IDXBUF |
   NINE_STATE_FF_MATERIAL |
-  NINE_STATE_BLEND_COLOR |
-  NINE_STATE_SAMPLE_MASK |
   NINE_STATE_FF_VSTRANSF;
memset(dst->changed.rs, ~0, (D3DRS_COUNT / 32) * sizeof(uint32_t));
dst->changed.rs[D3DRS_LAST / 32] |= (1 << (D3DRS_COUNT % 32)) - 1;
@@ -2500,7 +2490,6 @@ NineDevice9_SetTexture( struct NineDevice9 *This,
 
 if (This->is_recording) {
 state->changed.texture |= 1 << Stage;
-state->changed.group |= NINE_STATE_TEXTURE;
 nine_bind(>texture[Stage], pTexture);
 return D3D_OK;
 }
@@ -2549,8 +2538,6 @@ NineDevice9_SetTextureStageState( struct NineDevice9 
*This,
 state->ff.tex_stage[Stage][Type] = Value;
 
 if (unlikely(This->is_recording)) {
-if (Type == D3DTSS_TEXTURETRANSFORMFLAGS)
-state->changed.group |= NINE_STATE_PS_PARAMS_MISC;
 state->changed.group |= NINE_STATE_FF_PS_CONSTS;
 state->ff.changed.tex_stage[Stage][Type / 32] |= 1 << (Type % 32);
 } else
@@ -3544,8 +3531,6 @@ NineDevice9_SetStreamSourceFreq( struct NineDevice9 *This,
 if (unlikely(This->is_recording)) {
 state->stream_freq[StreamNumber] = Setting;
 state->changed.stream_freq |= 1 << StreamNumber;
-if (StreamNumber != 0)
-state->changed.group |= NINE_STATE_STREAMFREQ;
 return D3D_OK;
 }
 
@@ -3634,11 +3619,8 @@ NineDevice9_SetPixelShader( struct NineDevice9 *This,
 DBG("This=%p pShader=%p\n", This, pShader);
 
 if (unlikely(This->is_recording)) {
-/* Technically we need NINE_STATE_FB only
- * if the ps mask changes, but put it always
- * to be safe */
 nine_bind(>ps, pShader);
-state->changed.group |= NINE_STATE_PS | NINE_STATE_FB;
+state->changed.group |= NINE_STATE_PS;
 return D3D_OK;
 }
 
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/12] st/nine: Reduce MaxSimultaneousTextures to 8

2018-10-24 Thread Axel Davy
Windows drivers don't set this flag (which affects ff) to more than 8.

Do the same in case some games check for 8.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/adapter9.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/adapter9.c 
b/src/gallium/state_trackers/nine/adapter9.c
index 2fa92e4207b..ec18d21a94d 100644
--- a/src/gallium/state_trackers/nine/adapter9.c
+++ b/src/gallium/state_trackers/nine/adapter9.c
@@ -791,8 +791,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
 (DWORD)screen->get_param(screen, PIPE_CAP_BLEND_EQUATION_SEPARATE);
 pCaps->MaxSimultaneousTextures = screen->get_shader_param(screen,
 PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS);
-if (pCaps->MaxSimultaneousTextures > NINE_MAX_SAMPLERS_PS)
-pCaps->MaxSimultaneousTextures = NINE_MAX_SAMPLERS_PS;
+if (pCaps->MaxSimultaneousTextures > 8)
+pCaps->MaxSimultaneousTextures = 8;
 
 pCaps->VertexProcessingCaps = D3DVTXPCAPS_TEXGEN |
   D3DVTXPCAPS_TEXGEN_SPHEREMAP |
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/12] st/nine: Enable shadow mapping for ps 1.X

2018-10-24 Thread Axel Davy
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.

Fixes: https://github.com/iXit/Mesa-3D/issues/261

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c  |  8 +---
 src/gallium/state_trackers/nine/pixelshader9.c |  2 +-
 src/gallium/state_trackers/nine/pixelshader9.h | 14 --
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 2b11958b261..145647bc3f8 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -771,12 +771,13 @@ TEX_with_ps1x_projection(struct shader_translator *tx, 
struct ureg_dst dst,
 {
 unsigned dim = 1 + ((tx->info->projected >> (2 * idx)) & 3);
 struct ureg_dst tmp;
+boolean shadow = !!(tx->info->sampler_mask_shadow & (1 << idx));
 
 /* dim == 1: no projection
  * Looks like must be disabled when it makes no
  * sense according the texture dimensions
  */
-if (dim == 1 || dim <= target) {
+if (dim == 1 || (dim <= target && !shadow)) {
 ureg_TEX(tx->ureg, dst, target, src0, src1);
 } else if (dim == 4) {
 ureg_TXP(tx->ureg, dst, target, src0, src1);
@@ -2107,9 +2108,10 @@ d3dstt_to_tgsi_tex_shadow(BYTE sampler_type)
 static inline unsigned
 ps1x_sampler_type(const struct nine_shader_info *info, unsigned stage)
 {
+boolean shadow = !!(info->sampler_mask_shadow & (1 << stage));
 switch ((info->sampler_ps1xtypes >> (stage * 2)) & 0x3) {
-case 1: return TGSI_TEXTURE_1D;
-case 0: return TGSI_TEXTURE_2D;
+case 1: return shadow ? TGSI_TEXTURE_SHADOW1D : TGSI_TEXTURE_1D;
+case 0: return shadow ? TGSI_TEXTURE_SHADOW2D : TGSI_TEXTURE_2D;
 case 3: return TGSI_TEXTURE_3D;
 default:
 return TGSI_TEXTURE_CUBE;
diff --git a/src/gallium/state_trackers/nine/pixelshader9.c 
b/src/gallium/state_trackers/nine/pixelshader9.c
index 6f053f709bf..5d79019a1bc 100644
--- a/src/gallium/state_trackers/nine/pixelshader9.c
+++ b/src/gallium/state_trackers/nine/pixelshader9.c
@@ -164,7 +164,7 @@ NinePixelShader9_GetVariant( struct NinePixelShader9 *This )
 info.const_b_base = NINE_CONST_B_BASE(device->max_ps_const_f) / 16;
 info.byte_code = This->byte_code.tokens;
 info.sampler_mask_shadow = key & 0x;
-info.sampler_ps1xtypes = key;
+info.sampler_ps1xtypes = (key >> 16) & 0x;
 info.fog_enable = device->context.rs[D3DRS_FOGENABLE];
 info.fog_mode = device->context.rs[D3DRS_FOGTABLEMODE];
 info.force_color_in_centroid = key >> 34 & 1;
diff --git a/src/gallium/state_trackers/nine/pixelshader9.h 
b/src/gallium/state_trackers/nine/pixelshader9.h
index accd00a6a8c..bcbadd71057 100644
--- a/src/gallium/state_trackers/nine/pixelshader9.h
+++ b/src/gallium/state_trackers/nine/pixelshader9.h
@@ -68,13 +68,16 @@ NinePixelShader9_UpdateKey( struct NinePixelShader9 *ps,
 struct nine_context *context )
 {
 uint16_t samplers_shadow;
-uint32_t samplers_ps1_types;
+uint16_t samplers_ps1_types;
 uint16_t projected;
 uint64_t key;
 BOOL res;
 
+samplers_shadow = (uint16_t)((context->samplers_shadow & 
NINE_PS_SAMPLERS_MASK) >> NINE_SAMPLER_PS(0));
+key = samplers_shadow & ps->sampler_mask;
+
 if (unlikely(ps->byte_code.version < 0x20)) {
-/* no depth textures, but variable targets */
+/* variable targets */
 uint32_t m = ps->sampler_mask;
 samplers_ps1_types = 0;
 while (m) {
@@ -82,10 +85,9 @@ NinePixelShader9_UpdateKey( struct NinePixelShader9 *ps,
 m &= ~(1 << s);
 samplers_ps1_types |= (context->texture[s].enabled ? 
context->texture[s].pstype : 1) << (s * 2);
 }
-key = samplers_ps1_types;
-} else {
-samplers_shadow = (uint16_t)((context->samplers_shadow & 
NINE_PS_SAMPLERS_MASK) >> NINE_SAMPLER_PS(0));
-key = samplers_shadow & ps->sampler_mask;
+/* Note: For ps 1.X, only samplers 0 1 2 and 3 are available (except 
1.4 where 4 and 5 are available).
+ * Thus there is no overflow of samplers_ps1_types. */
+key |= samplers_ps1_types << 16;
 }
 
 if (ps->byte_code.version < 0x30) {
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/12] st/nine: Capture also default matrices for D3DSBT_ALL

2018-10-24 Thread Axel Davy
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c  | 37 ---
 src/gallium/state_trackers/nine/nine_state.h  |  3 ++
 src/gallium/state_trackers/nine/stateblock9.c | 25 -
 3 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index f4d9b423510..569a1b47292 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -3398,14 +3398,31 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 
 /* Misc */
 
+static D3DMATRIX nine_state_identity = { .m[0] = { 1, 0, 0, 0 },
+ .m[1] = { 0, 1, 0, 0 },
+ .m[2] = { 0, 0, 1, 0 },
+ .m[3] = { 0, 0, 0, 1 } };
+
+void
+nine_state_resize_transform(struct nine_ff_state *ff_state, unsigned N)
+{
+unsigned n = ff_state->num_transforms;
+
+if (N <= n)
+return;
+
+ff_state->transform = REALLOC(ff_state->transform,
+  n * sizeof(D3DMATRIX),
+  N * sizeof(D3DMATRIX));
+for (; n < N; ++n)
+ff_state->transform[n] = nine_state_identity;
+ff_state->num_transforms = N;
+}
+
 D3DMATRIX *
 nine_state_access_transform(struct nine_ff_state *ff_state, 
D3DTRANSFORMSTATETYPE t,
 boolean alloc)
 {
-static D3DMATRIX Identity = { .m[0] = { 1, 0, 0, 0 },
-  .m[1] = { 0, 1, 0, 0 },
-  .m[2] = { 0, 0, 1, 0 },
-  .m[3] = { 0, 0, 0, 1 } };
 unsigned index;
 
 switch (t) {
@@ -3427,17 +3444,9 @@ nine_state_access_transform(struct nine_ff_state 
*ff_state, D3DTRANSFORMSTATETYP
 }
 
 if (index >= ff_state->num_transforms) {
-unsigned N = index + 1;
-unsigned n = ff_state->num_transforms;
-
 if (!alloc)
-return 
-ff_state->transform = REALLOC(ff_state->transform,
-  n * sizeof(D3DMATRIX),
-  N * sizeof(D3DMATRIX));
-for (; n < N; ++n)
-ff_state->transform[n] = Identity;
-ff_state->num_transforms = N;
+return _state_identity;
+nine_state_resize_transform(ff_state, index + 1);
 }
 return _state->transform[index];
 }
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index 7c4517b3fef..55ccfd0f519 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -609,6 +609,9 @@ void nine_state_prepare_draw_sw(struct NineDevice9 *device,
 void nine_state_after_draw_sw(struct NineDevice9 *device);
 void nine_state_destroy_sw(struct NineDevice9 *device);
 
+void
+nine_state_resize_transform(struct nine_ff_state *ff_state, unsigned N);
+
 /* If @alloc is FALSE, the return value may be a const identity matrix.
  * Therefore, do not modify if you set alloc to FALSE !
  */
diff --git a/src/gallium/state_trackers/nine/stateblock9.c 
b/src/gallium/state_trackers/nine/stateblock9.c
index ebfd622ff91..7b2deae7f9b 100644
--- a/src/gallium/state_trackers/nine/stateblock9.c
+++ b/src/gallium/state_trackers/nine/stateblock9.c
@@ -357,8 +357,7 @@ nine_state_copy_common(struct NineDevice9 *device,
 if (!(mask->ff.changed.transform[i] & (1 << (s % 32
 continue;
 *nine_state_access_transform(>ff, s, TRUE) =
-*nine_state_access_transform( /* const because !alloc */
-(struct nine_ff_state *)>ff, s, FALSE);
+*nine_state_access_transform(>ff, s, FALSE);
 }
 if (apply)
 dst->ff.changed.transform[i] |= mask->ff.changed.transform[i];
@@ -369,7 +368,7 @@ nine_state_copy_common(struct NineDevice9 *device,
 static void
 nine_state_copy_common_all(struct NineDevice9 *device,
struct nine_state *dst,
-   const struct nine_state *src,
+   struct nine_state *src,
struct nine_state *help,
const boolean apply,
struct nine_range_pool *pool,
@@ -488,15 +487,21 @@ nine_state_copy_common_all(struct NineDevice9 *device,
 
 /* Transforms. */
 if (1) {
-if (dst->ff.num_transforms < src->ff.num_transforms) {
-dst->ff.transform = REALLOC

[Mesa-dev] [PATCH 06/12] st/nine: Never update device changed.* fields

2018-10-24 Thread Axel Davy
The device state changed.* field are never used.
These fields are used only for stateblocks.

Avoid setting them at all for clarity.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c |  4 +-
 src/gallium/state_trackers/nine/nine_state.c  |  7 +-
 src/gallium/state_trackers/nine/nine_state.h  |  2 +-
 src/gallium/state_trackers/nine/stateblock9.c | 94 ++-
 4 files changed, 59 insertions(+), 48 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 25a8172b3fd..461b212995b 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2178,9 +2178,11 @@ NineDevice9_LightEnable( struct NineDevice9 *This,
 NineDevice9_SetLight(This, Index, );
 }
 
-nine_state_light_enable(>ff, >changed.group, Index, Enable);
+nine_state_light_enable(>ff, Index, Enable);
 if (likely(!This->is_recording))
 nine_context_light_enable(This, Index, Enable);
+else
+state->changed.group |= NINE_STATE_FF_LIGHTING;
 
 return D3D_OK;
 }
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 569a1b47292..74aaf57a549 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -1827,7 +1827,8 @@ CSMT_ITEM_NO_WAIT(nine_context_light_enable,
 {
 struct nine_context *context = >context;
 
-nine_state_light_enable(>ff, >changed.group, Index, 
Enable);
+nine_state_light_enable(>ff, Index, Enable);
+context->changed.group |= NINE_STATE_FF_LIGHTING;
 }
 
 CSMT_ITEM_NO_WAIT(nine_context_set_texture_stage_state,
@@ -3480,7 +3481,7 @@ nine_state_set_light(struct nine_ff_state *ff_state, 
DWORD Index,
 }
 
 HRESULT
-nine_state_light_enable(struct nine_ff_state *ff_state, uint32_t *change_group,
+nine_state_light_enable(struct nine_ff_state *ff_state,
 DWORD Index, BOOL Enable)
 {
 unsigned i;
@@ -3509,8 +3510,6 @@ nine_state_light_enable(struct nine_ff_state *ff_state, 
uint32_t *change_group,
 ff_state->active_light[i] = ff_state->active_light[i + 1];
 }
 
-*change_group |= NINE_STATE_FF_LIGHTING;
-
 return D3D_OK;
 }
 
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index 55ccfd0f519..51e5e326527 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -623,7 +623,7 @@ HRESULT
 nine_state_set_light(struct nine_ff_state *, DWORD, const D3DLIGHT9 *);
 
 HRESULT
-nine_state_light_enable(struct nine_ff_state *, uint32_t *,
+nine_state_light_enable(struct nine_ff_state *,
 DWORD, BOOL);
 
 const char *nine_d3drs_to_string(DWORD State);
diff --git a/src/gallium/state_trackers/nine/stateblock9.c 
b/src/gallium/state_trackers/nine/stateblock9.c
index 7b2deae7f9b..50ed70aec3a 100644
--- a/src/gallium/state_trackers/nine/stateblock9.c
+++ b/src/gallium/state_trackers/nine/stateblock9.c
@@ -134,8 +134,15 @@ nine_state_copy_common(struct NineDevice9 *device,
 unsigned i, s;
 
 DBG("apply:%d changed.group: %x\n", (int)apply, (int)mask->changed.group );
-if (apply)
-   dst->changed.group |= mask->changed.group;
+
+/* device changed.* are unused.
+ * Instead nine_context_apply_stateblock is used and will
+ * internally set the right context->changed fields.
+ * Uncomment these only if we want to apply a stateblock onto a stateblock.
+ *
+ * if (apply)
+ * dst->changed.group |= mask->changed.group;
+ */
 
 if (mask->changed.group & NINE_STATE_VIEWPORT)
 dst->viewport = src->viewport;
@@ -202,10 +209,10 @@ nine_state_copy_common(struct NineDevice9 *device,
 /* Render states.
  * TODO: Maybe build a list ?
  */
-for (i = 0; i < ARRAY_SIZE(dst->changed.rs); ++i) {
+for (i = 0; i < ARRAY_SIZE(mask->changed.rs); ++i) {
 uint32_t m = mask->changed.rs[i];
-if (apply)
-dst->changed.rs[i] |= m;
+/* if (apply)
+ * dst->changed.rs[i] |= m; */
 while (m) {
 const int r = ffs(m) - 1;
 m &= ~(1 << r);
@@ -222,8 +229,8 @@ nine_state_copy_common(struct NineDevice9 *device,
 if (mask->changed.ucp & (1 << i))
 memcpy(dst->clip.ucp[i],
src->clip.ucp[i], sizeof(src->clip.ucp[0]));
-if (apply)
-   dst->changed.ucp |= mask->changed.ucp;
+/* if (apply)
+ *dst->changed.ucp |= mask->changed.ucp;*/
 }
 
 /* Sampler state. */
@@ -240,8 +247,8 @@ nine_state_copy_common(struct NineDevice9 *device,
 dst->samp_advertised[s][i] = src->samp_advertised[s][i];
 

[Mesa-dev] [PATCH 03/12] st/nine: Don't update unused world matrices

2018-10-24 Thread Axel Davy
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.

Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index c9901209189..f4d9b423510 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -2059,6 +2059,12 @@ nine_context_apply_stateblock(struct NineDevice9 *device,
 for (s = i * 32; s < (i * 32 + 32); ++s) {
 if (!(src->ff.changed.transform[i] & (1 << (s % 32
 continue;
+/* MaxVertexBlendMatrixIndex is 8, which means
+ * we don't read past index D3DTS_WORLDMATRIX(8).
+ * swvp is supposed to allow all 256, but we don't
+ * implement it for now. */
+if (s > D3DTS_WORLDMATRIX(8))
+break;
 nine_context_set_transform(device, s,
nine_state_access_transform(
(struct nine_ff_state 
*)>ff,
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/12] st/nine: Remove two unused states.

2018-10-24 Thread Axel Davy
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c| 2 +-
 src/gallium/state_trackers/nine/nine_state.h | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 51e49ac4303..3b174587a44 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2374,7 +2374,7 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
   NINE_STATE_BLEND |
   NINE_STATE_DSA |
   NINE_STATE_IDXBUF |
-  NINE_STATE_MATERIAL |
+  NINE_STATE_FF_MATERIAL |
   NINE_STATE_BLEND_COLOR |
   NINE_STATE_SAMPLE_MASK;
memset(dst->changed.rs, ~0, (D3DRS_COUNT / 32) * sizeof(uint32_t));
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index a3cc66ef8b5..7c4517b3fef 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -70,8 +70,6 @@
 #define NINE_STATE_VDECL   (1 << 12)
 #define NINE_STATE_IDXBUF  (1 << 13)
 #define NINE_STATE_STREAMFREQ  (1 << 14)
-#define NINE_STATE_PRIM(1 << 15)
-#define NINE_STATE_MATERIAL(1 << 16)
 #define NINE_STATE_BLEND_COLOR (1 << 17)
 #define NINE_STATE_STENCIL_REF (1 << 18)
 #define NINE_STATE_SAMPLE_MASK (1 << 19)
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/12] st/nine: Remove commented nine_context_apply_stateblock

2018-10-24 Thread Axel Davy
At some point the project was to adapt the
commented version to csmt.

The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 230 ---
 1 file changed, 230 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 3db9a07fbf4..c9901209189 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -1893,236 +1893,6 @@ CSMT_ITEM_NO_WAIT(nine_context_set_swvp,
 context->changed.group |= NINE_STATE_SWVP;
 }
 
-#if 0
-
-void
-nine_context_apply_stateblock(struct NineDevice9 *device,
-  const struct nine_state *src)
-{
-struct nine_context *context = >context;
-int i;
-
-context->changed.group |= src->changed.group;
-
-for (i = 0; i < ARRAY_SIZE(src->changed.rs); ++i) {
-uint32_t m = src->changed.rs[i];
-while (m) {
-const int r = ffs(m) - 1;
-m &= ~(1 << r);
-context->rs[i * 32 + r] = nine_fix_render_state_value(i * 32 + r, 
src->rs_advertised[i * 32 + r]);
-}
-}
-
-/* Textures */
-if (src->changed.texture) {
-uint32_t m = src->changed.texture;
-unsigned s;
-
-for (s = 0; m; ++s, m >>= 1) {
-struct NineBaseTexture9 *tex = src->texture[s];
-if (!(m & 1))
-continue;
-nine_context_set_texture(device, s, tex);
-}
-}
-
-/* Sampler state */
-if (src->changed.group & NINE_STATE_SAMPLER) {
-unsigned s;
-
-for (s = 0; s < NINE_MAX_SAMPLERS; ++s) {
-uint32_t m = src->changed.sampler[s];
-while (m) {
-const int i = ffs(m) - 1;
-m &= ~(1 << i);
-if (nine_check_sampler_state_value(i, 
src->samp_advertised[s][i]))
-context->samp[s][i] = src->samp_advertised[s][i];
-}
-context->changed.sampler[s] |= src->changed.sampler[s];
-}
-}
-
-/* Vertex buffers */
-if (src->changed.vtxbuf | src->changed.stream_freq) {
-uint32_t m = src->changed.vtxbuf | src->changed.stream_freq;
-for (i = 0; m; ++i, m >>= 1) {
-if (src->changed.vtxbuf & (1 << i)) {
-if (src->stream[i]) {
-unsigned offset = 0;
-pipe_resource_reference(>vtxbuf[i].buffer,
-src->stream[i] ? 
NineVertexBuffer9_GetResource(src->stream[i], ) : NULL);
-context->vtxbuf[i].buffer_offset = 
src->vtxbuf[i].buffer_offset + offset;
-context->vtxbuf[i].stride = src->vtxbuf[i].stride;
-}
-}
-if (src->changed.stream_freq & (1 << i)) {
-context->stream_freq[i] = src->stream_freq[i];
-if (src->stream_freq[i] & D3DSTREAMSOURCE_INSTANCEDATA)
-context->stream_instancedata_mask |= 1 << i;
-else
-context->stream_instancedata_mask &= ~(1 << i);
-}
-}
-context->changed.vtxbuf |= src->changed.vtxbuf;
-}
-
-/* Index buffer */
-if (src->changed.group & NINE_STATE_IDXBUF)
-nine_context_set_indices(device, src->idxbuf);
-
-/* Vertex declaration */
-if ((src->changed.group & NINE_STATE_VDECL) && src->vdecl)
-nine_context_set_vertex_declaration(device, src->vdecl);
-
-/* Vertex shader */
-if (src->changed.group & NINE_STATE_VS)
-nine_bind(>vs, src->vs);
-
-context->programmable_vs = context->vs && !(context->vdecl && 
context->vdecl->position_t);
-
-/* Pixel shader */
-if (src->changed.group & NINE_STATE_PS)
-nine_bind(>ps, src->ps);
-
-/* Vertex constants */
-if (src->changed.group & NINE_STATE_VS_CONST) {
-struct nine_range *r;
-if (device->may_swvp) {
-for (r = src->changed.vs_const_f; r; r = r->next) {
-int bgn = r->bgn;
-int end = r->end;
-memcpy(>vs_const_f_swvp[bgn * 4],
-   >vs_const_f[bgn * 4],
-   (end - bgn) * 4 * sizeof(float));
-if (bgn < device->max_vs_const_f) {
-end = MIN2(end, device->max_vs_const_f);
-memcpy(>vs_const_f[bgn * 4],
-   >vs_const_f[bgn * 4],
-  

Re: [Mesa-dev] [Mesa-stable] [PATCH 22/22 v2] radeonsi: NaN should pass kill_if

2018-10-01 Thread Axel Davy

On 01/10/2018 10:13, Juan A. Suarez Romero wrote:

On Mon, 2018-09-24 at 20:21 -0400, Marek Olšák wrote:

Looks good to me.

Marek

This patch was nominated to stable, but I can't apply it on 18.2 because this
patch requires 98f777f9d9c ("radeonsi: remove fetch_args callbacks for ALU
instructions"), which was not nominated.

WDYT? Should both patches added to the 18.2 release, or just get them out of the
release?

J.A.



Hi,

As Marek is the maintainer for radeonsi, I let him choose.


Yours,


Axel Davy





On Mon, Sep 24, 2018 at 2:29 AM, Axel Davy  wrote:

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105333
Fixes: https://github.com/iXit/Mesa-3D/issues/314

For this application, NaN is passed to KILL_IF and is expected to
pass.

v2: Explain in the code why UGE is used.

Signed-off-by: Axel Davy 
Reviewed-by: Marek Olšák 

CC: 
---
  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c 
b/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
index f54d025aec0..a768b449047 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
@@ -60,7 +60,8 @@ static void kil_emit(const struct lp_build_tgsi_action 
*action,

 for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
 LLVMValueRef value = lp_build_emit_fetch(bld_base, 
inst, 0, i);
-   conds[i] = LLVMBuildFCmp(builder, LLVMRealOGE, value,
+   /* UGE because NaN shouldn't get killed */
+   conds[i] = LLVMBuildFCmp(builder, LLVMRealUGE, value,
 ctx->ac.f32_0, "");
 }

--
2.18.0

___
mesa-stable mailing list
mesa-sta...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-stable

___
mesa-stable mailing list
mesa-sta...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-stable



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 22/22 v2] radeonsi: NaN should pass kill_if

2018-09-24 Thread Axel Davy
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105333
Fixes: https://github.com/iXit/Mesa-3D/issues/314

For this application, NaN is passed to KILL_IF and is expected to
pass.

v2: Explain in the code why UGE is used.

Signed-off-by: Axel Davy 
Reviewed-by: Marek Olšák 

CC: 
---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c 
b/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
index f54d025aec0..a768b449047 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
@@ -60,7 +60,8 @@ static void kil_emit(const struct lp_build_tgsi_action 
*action,
 
for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
LLVMValueRef value = lp_build_emit_fetch(bld_base, 
inst, 0, i);
-   conds[i] = LLVMBuildFCmp(builder, LLVMRealOGE, value,
+   /* UGE because NaN shouldn't get killed */
+   conds[i] = LLVMBuildFCmp(builder, LLVMRealUGE, value,
ctx->ac.f32_0, "");
}
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 20/22] st/nine: Capture also default matrices for D3DSBT_ALL

2018-09-23 Thread Axel Davy

It should be
last_index += D3DTS_WORLDMATRIX(0) - 10;

I drop this patch from the serie and patch 21/22 as they need more testing.

On 9/23/18 7:00 PM, Axel Davy wrote:

We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy 
---
  src/gallium/state_trackers/nine/stateblock9.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/gallium/state_trackers/nine/stateblock9.c 
b/src/gallium/state_trackers/nine/stateblock9.c
index ebfd622ff91..fd6f5d55677 100644
--- a/src/gallium/state_trackers/nine/stateblock9.c
+++ b/src/gallium/state_trackers/nine/stateblock9.c
@@ -494,6 +494,16 @@ nine_state_copy_common_all(struct NineDevice9 *device,
  src->ff.num_transforms * sizeof(src->ff.transform[0]));
  dst->ff.num_transforms = src->ff.num_transforms;
  }
+/* Alloc and init missing transforms */
+if (dst->ff.num_transforms > src->ff.num_transforms) {
+int last_index = dst->ff.num_transforms - 1;
+/* There a hole in the indices we fill */
+if (last_index >= 10)
+last_index += D3DTS_WORLDMATRIX(0);
+(void) nine_state_access_transform((struct nine_ff_state 
*)>ff,
+   last_index,
+   TRUE);
+}
  memcpy(dst->ff.transform,
 src->ff.transform, src->ff.num_transforms * sizeof(D3DMATRIX));
  if (apply) /* TODO: memset */



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 21/22] st/nine: Mark transform matrices dirty for D3DSBT_ALL

2018-09-23 Thread Axel Davy
D3DSBT_ALL stateblocks capture the transform matrices.

Fixes some d3d test programs not displaying properly.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 51e49ac4303..a051f6b62fa 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2376,13 +2376,24 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
   NINE_STATE_IDXBUF |
   NINE_STATE_MATERIAL |
   NINE_STATE_BLEND_COLOR |
-  NINE_STATE_SAMPLE_MASK;
+  NINE_STATE_SAMPLE_MASK |
+  NINE_STATE_FF_VSTRANSF;
memset(dst->changed.rs, ~0, (D3DRS_COUNT / 32) * sizeof(uint32_t));
dst->changed.rs[D3DRS_LAST / 32] |= (1 << (D3DRS_COUNT % 32)) - 1;
dst->changed.vtxbuf = (1ULL << This->caps.MaxStreams) - 1;
dst->changed.stream_freq = dst->changed.vtxbuf;
dst->changed.ucp = (1 << PIPE_MAX_CLIP_PLANES) - 1;
dst->changed.texture = (1 << NINE_MAX_SAMPLERS) - 1;
+   /* The doc says the projection, world, view and texture matrices
+* are saved, which would translate to:
+* dst->ff.changed.transform[0] = 0x00FF000C;
+* dst->ff.changed.transform[D3DTS_WORLD / 32] |= 1 << (D3DTS_WORLD % 
32);
+* However we assume they meant save everything (which is basically 
just the
+* above plus the other world matrices).
+*/
+   dst->ff.changed.transform[0] = 0x00FF000C;
+   for (s = 0; s < 8; s++)
+   dst->ff.changed.transform[8+s] = ~0;
 }
 NineStateBlock9_Capture(NineStateBlock9(*ppSB));
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 22/22] radeonsi: NaN should pass kill_if

2018-09-23 Thread Axel Davy
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105333
Fixes: https://github.com/iXit/Mesa-3D/issues/314

For this application, NaN is passed to KILL_IF and is expected to
pass.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c 
b/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
index f54d025aec0..3469ad9ca67 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c
@@ -60,7 +60,7 @@ static void kil_emit(const struct lp_build_tgsi_action 
*action,
 
for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
LLVMValueRef value = lp_build_emit_fetch(bld_base, 
inst, 0, i);
-   conds[i] = LLVMBuildFCmp(builder, LLVMRealOGE, value,
+   conds[i] = LLVMBuildFCmp(builder, LLVMRealUGE, value,
ctx->ac.f32_0, "");
}
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 20/22] st/nine: Capture also default matrices for D3DSBT_ALL

2018-09-23 Thread Axel Davy
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/stateblock9.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/state_trackers/nine/stateblock9.c 
b/src/gallium/state_trackers/nine/stateblock9.c
index ebfd622ff91..fd6f5d55677 100644
--- a/src/gallium/state_trackers/nine/stateblock9.c
+++ b/src/gallium/state_trackers/nine/stateblock9.c
@@ -494,6 +494,16 @@ nine_state_copy_common_all(struct NineDevice9 *device,
 src->ff.num_transforms * sizeof(src->ff.transform[0]));
 dst->ff.num_transforms = src->ff.num_transforms;
 }
+/* Alloc and init missing transforms */
+if (dst->ff.num_transforms > src->ff.num_transforms) {
+int last_index = dst->ff.num_transforms - 1;
+/* There a hole in the indices we fill */
+if (last_index >= 10)
+last_index += D3DTS_WORLDMATRIX(0);
+(void) nine_state_access_transform((struct nine_ff_state 
*)>ff,
+   last_index,
+   TRUE);
+}
 memcpy(dst->ff.transform,
src->ff.transform, src->ff.num_transforms * sizeof(D3DMATRIX));
 if (apply) /* TODO: memset */
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 18/22] st/nine: Split NINE_STATE_FF_OTHER

2018-09-23 Thread Axel Davy
NINE_STATE_FF_OTHER was mostly ff vs states.

Rename it to NINE_STATE_FF_VS_OTHER and
move common states with ps to
NINE_STATE_FF_PS_CONSTS (renamed from
NINE_STATE_FF_PSSTAGES).

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c |  4 +--
 src/gallium/state_trackers/nine/nine_ff.c |  4 +--
 src/gallium/state_trackers/nine/nine_state.c  | 30 +--
 src/gallium/state_trackers/nine/nine_state.h  |  6 ++--
 src/gallium/state_trackers/nine/stateblock9.c |  2 +-
 5 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 37fcba875ff..51e49ac4303 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2350,7 +2350,7 @@ NineDevice9_CreateStateBlock( struct NineDevice9 *This,
 if (Type == D3DSBT_ALL || Type == D3DSBT_PIXELSTATE) {
dst->changed.group |=
   NINE_STATE_PS | NINE_STATE_PS_CONST | NINE_STATE_BLEND |
-  NINE_STATE_FF_OTHER | NINE_STATE_FF_PSSTAGES | NINE_STATE_PS_CONST |
+  NINE_STATE_FF_VS_OTHER | NINE_STATE_FF_PS_CONSTS | 
NINE_STATE_PS_CONST |
   NINE_STATE_FB | NINE_STATE_DSA | NINE_STATE_MULTISAMPLE |
   NINE_STATE_RASTERIZER | NINE_STATE_STENCIL_REF;
memcpy(dst->changed.rs,
@@ -2539,7 +2539,7 @@ NineDevice9_SetTextureStageState( struct NineDevice9 
*This,
 if (unlikely(This->is_recording)) {
 if (Type == D3DTSS_TEXTURETRANSFORMFLAGS)
 state->changed.group |= NINE_STATE_PS_PARAMS_MISC;
-state->changed.group |= NINE_STATE_FF_PSSTAGES;
+state->changed.group |= NINE_STATE_FF_PS_CONSTS;
 state->ff.changed.tex_stage[Stage][Type / 32] |= 1 << (Type % 32);
 } else
 nine_context_set_texture_stage_state(This, Stage, Type, Value);
diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index 453f280c9fc..addea3dde1f 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1949,7 +1949,7 @@ nine_ff_load_point_and_fog_params(struct NineDevice9 
*device)
 struct nine_context *context = >context;
 struct fvec4 *dst = (struct fvec4 *)device->ff.vs_const;
 
-if (!(context->changed.group & NINE_STATE_FF_OTHER))
+if (!(context->changed.group & NINE_STATE_FF_VS_OTHER))
 return;
 dst[26].x = asfloat(context->rs[D3DRS_POINTSIZE_MIN]);
 dst[26].y = asfloat(context->rs[D3DRS_POINTSIZE_MAX]);
@@ -1986,7 +1986,7 @@ nine_ff_load_ps_params(struct NineDevice9 *device)
 struct fvec4 *dst = (struct fvec4 *)device->ff.ps_const;
 unsigned s;
 
-if (!(context->changed.group & (NINE_STATE_FF_PSSTAGES | 
NINE_STATE_FF_OTHER)))
+if (!(context->changed.group & NINE_STATE_FF_PS_CONSTS))
 return;
 
 for (s = 0; s < 8; ++s)
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index cb62c28b7b7..3db9a07fbf4 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -1870,7 +1870,7 @@ CSMT_ITEM_NO_WAIT(nine_context_set_texture_stage_state,
 context->changed.group |= NINE_STATE_PS_CONST;
 }
 
-context->changed.group |= NINE_STATE_FF_PSSTAGES;
+context->changed.group |= NINE_STATE_FF_PS_CONSTS;
 context->ff.changed.tex_stage[Stage][Type / 32] |= 1 << (Type % 32);
 }
 
@@ -2073,7 +2073,7 @@ nine_context_apply_stateblock(struct NineDevice9 *device,
 if (src->changed.group & NINE_STATE_FF_MATERIAL)
 context->ff.material = src->ff.material;
 
-if (src->changed.group & NINE_STATE_FF_PSSTAGES) {
+if (src->changed.group & NINE_STATE_FF_PS_CONSTS) {
 unsigned s;
 for (s = 0; s < NINE_MAX_TEXTURE_STAGES; ++s) {
 for (i = 0; i < NINED3DTSS_COUNT; ++i)
@@ -2266,7 +2266,7 @@ nine_context_apply_stateblock(struct NineDevice9 *device,
 if (src->changed.group & NINE_STATE_FF_MATERIAL)
 nine_context_set_material(device, >ff.material);
 
-if (src->changed.group & NINE_STATE_FF_PSSTAGES) {
+if (src->changed.group & NINE_STATE_FF_PS_CONSTS) {
 unsigned s;
 for (s = 0; s < NINE_MAX_TEXTURE_STAGES; ++s) {
 for (i = 0; i < NINED3DTSS_COUNT; ++i)
@@ -3531,11 +3531,11 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_ALPHABLENDENABLE] = NINE_STATE_BLEND,
 [D3DRS_FOGENABLE] = NINE_STATE_FF_SHADER | NINE_STATE_VS_PARAMS_MISC | 
NINE_STATE_PS_PARAMS_MISC | NINE_STATE_PS_CONST,
 [D3DRS_SPECULARENABLE] = NINE_STATE_FF_LIGHTING,
-[D3DRS_FOGCOLOR] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
+[D3DRS_FOGCOLOR] = NINE_STATE_FF_PS_CONSTS | NINE_STATE_PS_CONST,
 [D3DRS_FOGTABLEMODE] = NINE

[Mesa-dev] [PATCH 12/22] st/nine: Don't call SetCursor until a cursor is set

2018-09-23 Thread Axel Davy
The previous code was ignoring the input
until a cursor is set inside d3d
(with SetCursorProperties), as expected
by wine tests.

However it did still make a call to ID3DPresent_SetCursor,
which would result into a SetCursor(NULL) call, thus
hidding any cursor set outside d3d, which we shouldn't do.

Add comment about not avoiding redundant ID3DPresent_SetCursor
calls once a cursor has been set in d3d, as it has been tested to
cause regressions.

Fixes: https://github.com/iXit/Mesa-3D/issues/197

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index b3e56d70b74..04f90ad8210 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -810,7 +810,14 @@ NineDevice9_ShowCursor( struct NineDevice9 *This,
 
 DBG("This=%p bShow=%d\n", This, (int) bShow);
 
-This->cursor.visible = bShow && (This->cursor.hotspot.x != -1);
+/* No-op until a cursor is set in d3d */
+if (This->cursor.hotspot.x == -1)
+return old;
+
+This->cursor.visible = bShow;
+/* Note: Don't optimize by avoiding the call if This->cursor.visible
+ * hasn't changed. One has to keep in mind the app may do SetCursor
+ * calls outside d3d, thus such an optimization affects behaviour. */
 if (!This->cursor.software)
 This->cursor.software = 
ID3DPresent_SetCursor(This->swapchains[0]->present, NULL, NULL, bShow) != 
D3D_OK;
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 16/22] st/nine: Mark pointsize states as ff states

2018-09-23 Thread Axel Davy
The pointsize states were missing the ff
NINE_STATE_FF_OTHER flag, and thus might
miss state updates when using ff.

Fixes some wine tests.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 3ab90633d25..7e13feb83d6 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -3567,8 +3567,8 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_EMISSIVEMATERIALSOURCE] = NINE_STATE_FF_LIGHTING,
 [D3DRS_VERTEXBLEND] = NINE_STATE_FF_OTHER,
 [D3DRS_CLIPPLANEENABLE] = NINE_STATE_RASTERIZER,
-[D3DRS_POINTSIZE] = NINE_STATE_RASTERIZER,
-[D3DRS_POINTSIZE_MIN] = NINE_STATE_RASTERIZER | NINE_STATE_VS_PARAMS_MISC,
+[D3DRS_POINTSIZE] = NINE_STATE_RASTERIZER | NINE_STATE_FF_OTHER,
+[D3DRS_POINTSIZE_MIN] = NINE_STATE_RASTERIZER | NINE_STATE_FF_OTHER | 
NINE_STATE_VS_PARAMS_MISC,
 [D3DRS_POINTSPRITEENABLE] = NINE_STATE_RASTERIZER,
 [D3DRS_POINTSCALEENABLE] = NINE_STATE_FF_OTHER,
 [D3DRS_POINTSCALE_A] = NINE_STATE_FF_OTHER,
@@ -3578,7 +3578,7 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_MULTISAMPLEMASK] = NINE_STATE_SAMPLE_MASK,
 [D3DRS_PATCHEDGESTYLE] = NINE_STATE_UNHANDLED,
 [D3DRS_DEBUGMONITORTOKEN] = NINE_STATE_UNHANDLED,
-[D3DRS_POINTSIZE_MAX] = NINE_STATE_RASTERIZER | NINE_STATE_VS_PARAMS_MISC,
+[D3DRS_POINTSIZE_MAX] = NINE_STATE_RASTERIZER | NINE_STATE_FF_OTHER | 
NINE_STATE_VS_PARAMS_MISC,
 [D3DRS_INDEXEDVERTEXBLENDENABLE] = NINE_STATE_FF_OTHER,
 [D3DRS_COLORWRITEENABLE] = NINE_STATE_BLEND,
 [D3DRS_TWEENFACTOR] = NINE_STATE_FF_OTHER,
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/22] st/nine: Increase maximum number of temp registers

2018-09-23 Thread Axel Davy
With some test app I hit the limit.
As we allocate on demand (up to the maximum),
it is free to increase the limit.

Signed-off-by: Axel Davy 
CC: 
---
 src/gallium/state_trackers/nine/nine_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 5c33a6308c2..2b11958b261 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -487,7 +487,7 @@ struct shader_translator
 struct ureg_dst predicate_dst;
 struct ureg_dst tS[8]; /* texture stage registers */
 struct ureg_dst tdst; /* scratch dst if we need extra modifiers */
-struct ureg_dst t[5]; /* scratch TEMPs */
+struct ureg_dst t[8]; /* scratch TEMPs */
 struct ureg_src vC[2]; /* PS color in */
 struct ureg_src vT[8]; /* PS texcoord in */
 struct ureg_dst rL[NINE_MAX_LOOP_DEPTH]; /* loop ctr */
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 19/22] st/nine: Do not mark both ff vs and ps updated

2018-09-23 Thread Axel Davy
Previously if only ff vs or only ff ps was used,
the constants for both were marked as updated,
while only the constants of the used ff shader
were updated.

Now that NINE_STATE_FF_VS and
NINE_STATE_FF_PS do not intersect anymore,
we can correctly mark the correct set of constant
as updated.

Fixes: https://github.com/iXit/Mesa-3D/issues/319

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index addea3dde1f..261be276ad8 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -2066,6 +2066,8 @@ nine_ff_update(struct NineDevice9 *device)
 
 context->pipe_data.cb_vs_ff = cb;
 context->commit |= NINE_STATE_COMMIT_CONST_VS;
+
+context->changed.group &= ~NINE_STATE_FF_VS;
 }
 
 if (!context->ps) {
@@ -2078,9 +2080,9 @@ nine_ff_update(struct NineDevice9 *device)
 
 context->pipe_data.cb_ps_ff = cb;
 context->commit |= NINE_STATE_COMMIT_CONST_PS;
-}
 
-context->changed.group &= ~NINE_STATE_FF;
+context->changed.group &= ~NINE_STATE_FF_PS;
+}
 }
 
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/22] st/nine: Add dummy ff shader state

2018-09-23 Thread Axel Davy
Some states only affect the ff shader,
not its constants.
Currently we don't check anything and
always recompute the ff shader key.

However we do check for NINE_STATE_FF_OTHER
and if set we reupload some constants.

Thus for those states which had NINE_STATE_FF_OTHER
set but didn't need it,
replace by a dummy ff shader state (which is
easier to understand for an external reader than
just setting 0 and more future proof).

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 16 
 src/gallium/state_trackers/nine/nine_state.h |  4 
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 7e13feb83d6..cb62c28b7b7 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -3529,14 +3529,14 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_ALPHAFUNC] = NINE_STATE_DSA,
 [D3DRS_DITHERENABLE] = NINE_STATE_BLEND,
 [D3DRS_ALPHABLENDENABLE] = NINE_STATE_BLEND,
-[D3DRS_FOGENABLE] = NINE_STATE_FF_OTHER | NINE_STATE_VS_PARAMS_MISC | 
NINE_STATE_PS_PARAMS_MISC | NINE_STATE_PS_CONST,
+[D3DRS_FOGENABLE] = NINE_STATE_FF_SHADER | NINE_STATE_VS_PARAMS_MISC | 
NINE_STATE_PS_PARAMS_MISC | NINE_STATE_PS_CONST,
 [D3DRS_SPECULARENABLE] = NINE_STATE_FF_LIGHTING,
 [D3DRS_FOGCOLOR] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
-[D3DRS_FOGTABLEMODE] = NINE_STATE_FF_OTHER | NINE_STATE_PS_PARAMS_MISC | 
NINE_STATE_PS_CONST,
+[D3DRS_FOGTABLEMODE] = NINE_STATE_FF_SHADER | NINE_STATE_PS_PARAMS_MISC | 
NINE_STATE_PS_CONST,
 [D3DRS_FOGSTART] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
 [D3DRS_FOGEND] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
 [D3DRS_FOGDENSITY] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
-[D3DRS_RANGEFOGENABLE] = NINE_STATE_FF_OTHER,
+[D3DRS_RANGEFOGENABLE] = NINE_STATE_FF_SHADER,
 [D3DRS_STENCILENABLE] = NINE_STATE_DSA | NINE_STATE_MULTISAMPLE,
 [D3DRS_STENCILFAIL] = NINE_STATE_DSA,
 [D3DRS_STENCILZFAIL] = NINE_STATE_DSA,
@@ -3557,20 +3557,20 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_CLIPPING] = 0, /* software vertex processing only */
 [D3DRS_LIGHTING] = NINE_STATE_FF_LIGHTING,
 [D3DRS_AMBIENT] = NINE_STATE_FF_LIGHTING | NINE_STATE_FF_MATERIAL,
-[D3DRS_FOGVERTEXMODE] = NINE_STATE_FF_OTHER,
+[D3DRS_FOGVERTEXMODE] = NINE_STATE_FF_SHADER,
 [D3DRS_COLORVERTEX] = NINE_STATE_FF_LIGHTING,
 [D3DRS_LOCALVIEWER] = NINE_STATE_FF_LIGHTING,
-[D3DRS_NORMALIZENORMALS] = NINE_STATE_FF_OTHER,
+[D3DRS_NORMALIZENORMALS] = NINE_STATE_FF_SHADER,
 [D3DRS_DIFFUSEMATERIALSOURCE] = NINE_STATE_FF_LIGHTING,
 [D3DRS_SPECULARMATERIALSOURCE] = NINE_STATE_FF_LIGHTING,
 [D3DRS_AMBIENTMATERIALSOURCE] = NINE_STATE_FF_LIGHTING,
 [D3DRS_EMISSIVEMATERIALSOURCE] = NINE_STATE_FF_LIGHTING,
-[D3DRS_VERTEXBLEND] = NINE_STATE_FF_OTHER,
+[D3DRS_VERTEXBLEND] = NINE_STATE_FF_SHADER,
 [D3DRS_CLIPPLANEENABLE] = NINE_STATE_RASTERIZER,
 [D3DRS_POINTSIZE] = NINE_STATE_RASTERIZER | NINE_STATE_FF_OTHER,
 [D3DRS_POINTSIZE_MIN] = NINE_STATE_RASTERIZER | NINE_STATE_FF_OTHER | 
NINE_STATE_VS_PARAMS_MISC,
 [D3DRS_POINTSPRITEENABLE] = NINE_STATE_RASTERIZER,
-[D3DRS_POINTSCALEENABLE] = NINE_STATE_FF_OTHER,
+[D3DRS_POINTSCALEENABLE] = NINE_STATE_FF_SHADER,
 [D3DRS_POINTSCALE_A] = NINE_STATE_FF_OTHER,
 [D3DRS_POINTSCALE_B] = NINE_STATE_FF_OTHER,
 [D3DRS_POINTSCALE_C] = NINE_STATE_FF_OTHER,
@@ -3579,7 +3579,7 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_PATCHEDGESTYLE] = NINE_STATE_UNHANDLED,
 [D3DRS_DEBUGMONITORTOKEN] = NINE_STATE_UNHANDLED,
 [D3DRS_POINTSIZE_MAX] = NINE_STATE_RASTERIZER | NINE_STATE_FF_OTHER | 
NINE_STATE_VS_PARAMS_MISC,
-[D3DRS_INDEXEDVERTEXBLENDENABLE] = NINE_STATE_FF_OTHER,
+[D3DRS_INDEXEDVERTEXBLENDENABLE] = NINE_STATE_FF_SHADER,
 [D3DRS_COLORWRITEENABLE] = NINE_STATE_BLEND,
 [D3DRS_TWEENFACTOR] = NINE_STATE_FF_OTHER,
 [D3DRS_BLENDOP] = NINE_STATE_BLEND,
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index b8a74a4ee2f..77823655efa 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -90,6 +90,10 @@
 #define NINE_STATE_ALL  0x1fff
 #define NINE_STATE_UNHANDLED   (1 << 29)
 
+/* These states affect the ff shader key,
+ * which we recompute everytime. */
+#define NINE_STATE_FF_SHADER0
+
 #define NINE_STATE_COMMIT_DSA  (1 << 0)
 #define NINE_STATE_COMMIT_RASTERIZER (1 << 1)
 #define NINE_STATE_COMMIT_BLEND (1 << 2)
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 15/22] st/nine: Minor refactor of a few NINE_STATE_* flags

2018-09-23 Thread Axel Davy
Rename NINE_STATE_FOG_SHADER,
NINE_STATE_POINTSIZE_SHADER and NINE_STATE_PS1X_SHADER
into
NINE_STATE_VS_PARAMS_MISC and NINE_STATE_PS_PARAMS_MISC.

The behaviour is unchanged, except one minor change:
D3DRS_FOGTABLEMODE doesn't need to affect VS.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c|  2 +-
 src/gallium/state_trackers/nine/nine_state.c | 16 +++-
 src/gallium/state_trackers/nine/nine_state.h | 13 ++---
 3 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 04f90ad8210..37fcba875ff 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -2538,7 +2538,7 @@ NineDevice9_SetTextureStageState( struct NineDevice9 
*This,
 
 if (unlikely(This->is_recording)) {
 if (Type == D3DTSS_TEXTURETRANSFORMFLAGS)
-state->changed.group |= NINE_STATE_PS1X_SHADER;
+state->changed.group |= NINE_STATE_PS_PARAMS_MISC;
 state->changed.group |= NINE_STATE_FF_PSSTAGES;
 state->ff.changed.tex_stage[Stage][Type / 32] |= 1 << (Type % 32);
 } else
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index c81a05a952b..3ab90633d25 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -1077,15 +1077,13 @@ commit_ps(struct NineDevice9 *device)
 #define NINE_STATE_SHADER_CHANGE_VS \
(NINE_STATE_VS | \
 NINE_STATE_TEXTURE |\
-NINE_STATE_FOG_SHADER | \
-NINE_STATE_POINTSIZE_SHADER | \
+NINE_STATE_VS_PARAMS_MISC | \
 NINE_STATE_SWVP)
 
 #define NINE_STATE_SHADER_CHANGE_PS \
(NINE_STATE_PS | \
 NINE_STATE_TEXTURE |\
-NINE_STATE_FOG_SHADER | \
-NINE_STATE_PS1X_SHADER)
+NINE_STATE_PS_PARAMS_MISC)
 
 #define NINE_STATE_FREQUENT \
(NINE_STATE_RASTERIZER | \
@@ -1861,7 +1859,7 @@ CSMT_ITEM_NO_WAIT(nine_context_set_texture_stage_state,
 bumpmap_index = 4 * 8 + 2 * Stage + 1;
 break;
 case D3DTSS_TEXTURETRANSFORMFLAGS:
-context->changed.group |= NINE_STATE_PS1X_SHADER;
+context->changed.group |= NINE_STATE_PS_PARAMS_MISC;
 break;
 default:
 break;
@@ -3531,10 +3529,10 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_ALPHAFUNC] = NINE_STATE_DSA,
 [D3DRS_DITHERENABLE] = NINE_STATE_BLEND,
 [D3DRS_ALPHABLENDENABLE] = NINE_STATE_BLEND,
-[D3DRS_FOGENABLE] = NINE_STATE_FF_OTHER | NINE_STATE_FOG_SHADER | 
NINE_STATE_PS_CONST,
+[D3DRS_FOGENABLE] = NINE_STATE_FF_OTHER | NINE_STATE_VS_PARAMS_MISC | 
NINE_STATE_PS_PARAMS_MISC | NINE_STATE_PS_CONST,
 [D3DRS_SPECULARENABLE] = NINE_STATE_FF_LIGHTING,
 [D3DRS_FOGCOLOR] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
-[D3DRS_FOGTABLEMODE] = NINE_STATE_FF_OTHER | NINE_STATE_FOG_SHADER | 
NINE_STATE_PS_CONST,
+[D3DRS_FOGTABLEMODE] = NINE_STATE_FF_OTHER | NINE_STATE_PS_PARAMS_MISC | 
NINE_STATE_PS_CONST,
 [D3DRS_FOGSTART] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
 [D3DRS_FOGEND] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
 [D3DRS_FOGDENSITY] = NINE_STATE_FF_OTHER | NINE_STATE_PS_CONST,
@@ -3570,7 +3568,7 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_VERTEXBLEND] = NINE_STATE_FF_OTHER,
 [D3DRS_CLIPPLANEENABLE] = NINE_STATE_RASTERIZER,
 [D3DRS_POINTSIZE] = NINE_STATE_RASTERIZER,
-[D3DRS_POINTSIZE_MIN] = NINE_STATE_RASTERIZER | 
NINE_STATE_POINTSIZE_SHADER,
+[D3DRS_POINTSIZE_MIN] = NINE_STATE_RASTERIZER | NINE_STATE_VS_PARAMS_MISC,
 [D3DRS_POINTSPRITEENABLE] = NINE_STATE_RASTERIZER,
 [D3DRS_POINTSCALEENABLE] = NINE_STATE_FF_OTHER,
 [D3DRS_POINTSCALE_A] = NINE_STATE_FF_OTHER,
@@ -3580,7 +3578,7 @@ const uint32_t nine_render_state_group[NINED3DRS_LAST + 
1] =
 [D3DRS_MULTISAMPLEMASK] = NINE_STATE_SAMPLE_MASK,
 [D3DRS_PATCHEDGESTYLE] = NINE_STATE_UNHANDLED,
 [D3DRS_DEBUGMONITORTOKEN] = NINE_STATE_UNHANDLED,
-[D3DRS_POINTSIZE_MAX] = NINE_STATE_RASTERIZER | 
NINE_STATE_POINTSIZE_SHADER,
+[D3DRS_POINTSIZE_MAX] = NINE_STATE_RASTERIZER | NINE_STATE_VS_PARAMS_MISC,
 [D3DRS_INDEXEDVERTEXBLENDENABLE] = NINE_STATE_FF_OTHER,
 [D3DRS_COLORWRITEENABLE] = NINE_STATE_BLEND,
 [D3DRS_TWEENFACTOR] = NINE_STATE_FF_OTHER,
diff --git a/src/gallium/state_trackers/nine/nine_state.h 
b/src/gallium/state_trackers/nine/nine_state.h
index f5fd1ef9cd8..b8a74a4ee2f 100644
--- a/src/gallium/state_trackers/nine/nine_state.h
+++ b/src/gallium/state_trackers/nine/nine_state.h
@@ -83,13 +83,12 @@
 #define NINE_STATE_FF_VSTRANSF (1 << 22)
 #define NINE_STATE_FF_PSSTAGES (1 << 23)
 #define NINE_STATE_FF_OTHER(1 << 24)
-#define NINE_STATE_FOG_SHADER  (1 << 25)
-#define NINE_STATE_PS1X_SHADER (1 << 26)
-#define NINE_STATE_POINTSIZE_SHADER (1 

[Mesa-dev] [PATCH 10/22] st/nine: Init cursor position at device creation

2018-09-23 Thread Axel Davy
This is only useful for software cursor,
but at least now we won't start it at (0, 0).

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 9bb97bdf9c3..113ba9d975d 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -333,8 +333,11 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->cursor.hotspot.y = -1;
 This->cursor.w = This->cursor.h = 0;
 This->cursor.visible = FALSE;
-This->cursor.pos.x = 0;
-This->cursor.pos.y = 0;
+if (ID3DPresent_GetCursorPos(This->swapchains[0]->present, 
>cursor.pos) != S_OK) {
+This->cursor.pos.x = 0;
+This->cursor.pos.y = 0;
+}
+
 {
 struct pipe_resource tmpl;
 memset(, 0, sizeof(tmpl));
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/22] st/nine: Lock the entire buffer in some cases.

2018-09-23 Thread Axel Davy
Previously we had already found that for
MANAGED buffers the buffer started dirty
(which meant all writes out of bound
before the first draw call using the
buffer have to be taken into account).

Possibly it is the same for the other types of buffers.
For now always lock the entire buffer (starting from the offset)
for these (except for DYNAMIC buffers, which might hurt
performance too much).

Fixes: https://github.com/iXit/Mesa-3D/issues/301

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/buffer9.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/state_trackers/nine/buffer9.c 
b/src/gallium/state_trackers/nine/buffer9.c
index 69b08e8c10e..5880ee3c1a2 100644
--- a/src/gallium/state_trackers/nine/buffer9.c
+++ b/src/gallium/state_trackers/nine/buffer9.c
@@ -231,6 +231,14 @@ NineBuffer9_Lock( struct NineBuffer9 *This,
 user_warn(OffsetToLock != 0);
 }
 
+/* Write out of bound seems to have to be taken into account for these.
+ * TODO: Do more tests (is it only at buffer first lock ? etc).
+ * Since these buffers are supposed to be locked once and never
+ * writen again (MANAGED or DYNAMIC is used for the other uses cases),
+ * performance should be unaffected. */
+if (!(This->base.usage & D3DUSAGE_DYNAMIC) && This->base.pool != 
D3DPOOL_MANAGED)
+SizeToLock = This->size - OffsetToLock;
+
 u_box_1d(OffsetToLock, SizeToLock, );
 
 if (This->base.pool == D3DPOOL_MANAGED) {
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/22] st/nine: Avoid redundant SetCursorPos calls

2018-09-23 Thread Axel Davy
For some applications SetCursorPosition
is called when a cursor event is received.

Our SetCursorPosition was always calling
wine SetCursorPos which would trigger
a cursor event.

The infinite loop is avoided by not calling
SetCursorPos when the position hasn't changed.
Found thanks to wine tests.

Fixes irresponsive GUI for some applications.

Fixes: https://github.com/iXit/Mesa-3D/issues/173

Signed-off-by: Axel Davy 
CC: 
---
 src/gallium/state_trackers/nine/device9.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 113ba9d975d..b3e56d70b74 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -791,6 +791,10 @@ NineDevice9_SetCursorPosition( struct NineDevice9 *This,
 
 DBG("This=%p X=%d Y=%d Flags=%d\n", This, X, Y, Flags);
 
+if (This->cursor.pos.x == X &&
+This->cursor.pos.y == Y)
+return;
+
 This->cursor.pos.x = X;
 This->cursor.pos.y = Y;
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/22] st/nine: Initialize manually cursor structure

2018-09-23 Thread Axel Davy
Initialize manually the cursor structure fields
for more clarity on its content.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 293f63bd7b7..9bb97bdf9c3 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -331,6 +331,10 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->cursor.software = FALSE;
 This->cursor.hotspot.x = -1;
 This->cursor.hotspot.y = -1;
+This->cursor.w = This->cursor.h = 0;
+This->cursor.visible = FALSE;
+This->cursor.pos.x = 0;
+This->cursor.pos.y = 0;
 {
 struct pipe_resource tmpl;
 memset(, 0, sizeof(tmpl));
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/22] st/nine: Check if format is DS before retrieving flags

2018-09-23 Thread Axel Davy
d3d9_get_pipe_depth_format_bindings assumes the input format
is a depth stencil format.
Previously the user could hit this function with an invalid format.
Protect the last non protected call with a depth_stencil_format check.

Another solution is to have d3d9_get_pipe_depth_format_bindings
support non depth stencil format, but we don't want the user
to create depth buffers with d3d formats that can't be one,
it's better to check if the format can be depth buffer with d3d.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/surface9.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/nine/surface9.c 
b/src/gallium/state_trackers/nine/surface9.c
index 71aa4f46ffd..5fd662fa049 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -111,6 +111,8 @@ NineSurface9_ctor( struct NineSurface9 *This,
 if (pDesc->Usage & D3DUSAGE_RENDERTARGET) {
 This->base.info.bind |= PIPE_BIND_RENDER_TARGET;
 } else if (pDesc->Usage & D3DUSAGE_DEPTHSTENCIL) {
+if (!depth_stencil_format(pDesc->Format))
+return D3DERR_INVALIDCALL;
 This->base.info.bind = 
d3d9_get_pipe_depth_format_bindings(pDesc->Format);
 if (TextureType)
 This->base.info.bind |= PIPE_BIND_SAMPLER_VIEW;
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/22] st/nine: Fix ff assignment with aliasing

2018-09-23 Thread Axel Davy
"tex_stage[s][D3DTSS_COLORARG0] >> 4" could be a two bit
number, thus colorarg_b4 was incorrectly set.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index c64085b2ac4..2cb5f080be8 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1768,20 +1768,20 @@ nine_ff_get_ps(struct NineDevice9 *device)
 if (used_c & 0x1) key.ts[s].colorarg0 = 
context->ff.tex_stage[s][D3DTSS_COLORARG0] & 0x7;
 if (used_c & 0x2) key.ts[s].colorarg1 = 
context->ff.tex_stage[s][D3DTSS_COLORARG1] & 0x7;
 if (used_c & 0x4) key.ts[s].colorarg2 = 
context->ff.tex_stage[s][D3DTSS_COLORARG2] & 0x7;
-if (used_c & 0x1) key.colorarg_b4[0] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG0] >> 4) << s;
-if (used_c & 0x1) key.colorarg_b5[0] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG0] >> 5) << s;
-if (used_c & 0x2) key.colorarg_b4[1] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG1] >> 4) << s;
-if (used_c & 0x2) key.colorarg_b5[1] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG1] >> 5) << s;
-if (used_c & 0x4) key.colorarg_b4[2] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG2] >> 4) << s;
-if (used_c & 0x4) key.colorarg_b5[2] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG2] >> 5) << s;
+if (used_c & 0x1) key.colorarg_b4[0] |= 
((context->ff.tex_stage[s][D3DTSS_COLORARG0] >> 4) & 0x1) << s;
+if (used_c & 0x1) key.colorarg_b5[0] |= 
((context->ff.tex_stage[s][D3DTSS_COLORARG0] >> 5) & 0x1) << s;
+if (used_c & 0x2) key.colorarg_b4[1] |= 
((context->ff.tex_stage[s][D3DTSS_COLORARG1] >> 4) & 0x1) << s;
+if (used_c & 0x2) key.colorarg_b5[1] |= 
((context->ff.tex_stage[s][D3DTSS_COLORARG1] >> 5) & 0x1) << s;
+if (used_c & 0x4) key.colorarg_b4[2] |= 
((context->ff.tex_stage[s][D3DTSS_COLORARG2] >> 4) & 0x1) << s;
+if (used_c & 0x4) key.colorarg_b5[2] |= 
((context->ff.tex_stage[s][D3DTSS_COLORARG2] >> 5) & 0x1) << s;
 }
 if (key.ts[s].alphaop != D3DTOP_DISABLE) {
 if (used_a & 0x1) key.ts[s].alphaarg0 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG0] & 0x7;
 if (used_a & 0x2) key.ts[s].alphaarg1 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG1] & 0x7;
 if (used_a & 0x4) key.ts[s].alphaarg2 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG2] & 0x7;
-if (used_a & 0x1) key.alphaarg_b4[0] |= 
(context->ff.tex_stage[s][D3DTSS_ALPHAARG0] >> 4) << s;
-if (used_a & 0x2) key.alphaarg_b4[1] |= 
(context->ff.tex_stage[s][D3DTSS_ALPHAARG1] >> 4) << s;
-if (used_a & 0x4) key.alphaarg_b4[2] |= 
(context->ff.tex_stage[s][D3DTSS_ALPHAARG2] >> 4) << s;
+if (used_a & 0x1) key.alphaarg_b4[0] |= 
((context->ff.tex_stage[s][D3DTSS_ALPHAARG0] >> 4) & 0x1) << s;
+if (used_a & 0x2) key.alphaarg_b4[1] |= 
((context->ff.tex_stage[s][D3DTSS_ALPHAARG1] >> 4) & 0x1) << s;
+if (used_a & 0x4) key.alphaarg_b4[2] |= 
((context->ff.tex_stage[s][D3DTSS_ALPHAARG2] >> 4) & 0x1) << s;
 }
 key.ts[s].resultarg = context->ff.tex_stage[s][D3DTSS_RESULTARG] == 
D3DTA_TEMP;
 
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/22] st/nine: Print transform matrices in debug

2018-09-23 Thread Axel Davy
This is useful to see the matrices content
in the log to debug.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 150f5e3e05e..293f63bd7b7 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -1967,6 +1967,19 @@ NineDevice9_Clear( struct NineDevice9 *This,
 return D3D_OK;
 }
 
+static void
+nine_D3DMATRIX_print(const D3DMATRIX *M)
+{
+DBG("\n(%f %f %f %f)\n"
+"(%f %f %f %f)\n"
+"(%f %f %f %f)\n"
+"(%f %f %f %f)\n",
+M->m[0][0], M->m[0][1], M->m[0][2], M->m[0][3],
+M->m[1][0], M->m[1][1], M->m[1][2], M->m[1][3],
+M->m[2][0], M->m[2][1], M->m[2][2], M->m[2][3],
+M->m[3][0], M->m[3][1], M->m[3][2], M->m[3][3]);
+}
+
 HRESULT NINE_WINAPI
 NineDevice9_SetTransform( struct NineDevice9 *This,
   D3DTRANSFORMSTATETYPE State,
@@ -1978,6 +1991,7 @@ NineDevice9_SetTransform( struct NineDevice9 *This,
 DBG("This=%p State=%d pMatrix=%p\n", This, State, pMatrix);
 
 user_assert(M, D3DERR_INVALIDCALL);
+nine_D3DMATRIX_print(pMatrix);
 
 *M = *pMatrix;
 if (unlikely(This->is_recording)) {
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/22] st/nine: Implement predicated instructions

2018-09-23 Thread Axel Davy
Most of the work was already there, just not implemented.

Fixes: https://github.com/iXit/Mesa-3D/issues/318

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 62 ---
 1 file changed, 55 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 913508fb889..9e90da59597 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -481,6 +481,9 @@ struct shader_translator
 struct ureg_dst p;
 struct ureg_dst address;
 struct ureg_dst a0;
+struct ureg_dst predicate;
+struct ureg_dst predicate_tmp;
+struct ureg_dst predicate_dst;
 struct ureg_dst tS[8]; /* texture stage registers */
 struct ureg_dst tdst; /* scratch dst if we need extra modifiers */
 struct ureg_dst t[5]; /* scratch TEMPs */
@@ -496,6 +499,7 @@ struct shader_translator
 unsigned loop_labels[NINE_MAX_LOOP_DEPTH];
 unsigned cond_labels[NINE_MAX_COND_DEPTH];
 boolean loop_or_rep[NINE_MAX_LOOP_DEPTH]; /* true: loop, false: rep */
+boolean predicated_activated;
 
 unsigned *inst_labels; /* LABEL op */
 unsigned num_inst_labels;
@@ -975,7 +979,12 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 }
 break;
 case D3DSPR_PREDICATE:
-assert(!"D3DSPR_PREDICATE");
+if (ureg_dst_is_undef(tx->regs.predicate)) {
+/* Forbidden to use the predicate register before being set */
+tx->failure = TRUE;
+tx->regs.predicate = ureg_DECL_temporary(tx->ureg);
+}
+src = ureg_src(tx->regs.predicate);
 break;
 case D3DSPR_SAMPLER:
 assert(param->mod == NINED3DSPSM_NONE);
@@ -1157,11 +1166,15 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_src(tmp);
 break;
 case NINED3DSPSM_NOT:
-if (tx->native_integers) {
+if (tx->native_integers && param->file == D3DSPR_CONSTBOOL) {
 tmp = tx_scratch(tx);
 ureg_NOT(ureg, tmp, src);
 src = ureg_src(tmp);
 break;
+} else { /* predicate */
+tmp = tx_scratch(tx);
+ureg_ADD(ureg, tmp, ureg_imm1f(ureg, 1.0f), ureg_negate(src));
+src = ureg_src(tmp);
 }
 /* fall through */
 case NINED3DSPSM_COMP:
@@ -1292,7 +1305,9 @@ _tx_dst_param(struct shader_translator *tx, const struct 
sm1_dst_param *param)
 dst = tx->regs.oDepth; /* XXX: must write .z component */
 break;
 case D3DSPR_PREDICATE:
-assert(!"D3DSPR_PREDICATE");
+if (ureg_dst_is_undef(tx->regs.predicate))
+tx->regs.predicate = ureg_DECL_temporary(tx->ureg);
+dst = tx->regs.predicate;
 break;
 case D3DSPR_TEMPFLOAT16:
 DBG("unhandled D3DSPR: %u\n", param->file);
@@ -1309,6 +1324,11 @@ _tx_dst_param(struct shader_translator *tx, const struct 
sm1_dst_param *param)
 if (param->mod & NINED3DSPDM_SATURATE)
 dst = ureg_saturate(dst);
 
+if (tx->predicated_activated) {
+tx->regs.predicate_dst = dst;
+dst = tx->regs.predicate_tmp;
+}
+
 return dst;
 }
 
@@ -2891,12 +2911,24 @@ DECL_SPECIAL(TEXLDL)
 
 DECL_SPECIAL(SETP)
 {
-STUB(D3DERR_INVALIDCALL);
+const unsigned cmp_op = sm1_insn_flags_to_tgsi_setop(tx->insn.flags);
+struct ureg_dst dst = tx_dst_param(tx, >insn.dst[0]);
+struct ureg_src src[2] = {
+   tx_src_param(tx, >insn.src[0]),
+   tx_src_param(tx, >insn.src[1])
+};
+ureg_insn(tx->ureg, cmp_op, , 1, src, 2, 0);
+return D3D_OK;
 }
 
 DECL_SPECIAL(BREAKP)
 {
-STUB(D3DERR_INVALIDCALL);
+struct ureg_src src = tx_src_param(tx, >insn.src[0]);
+ureg_IF(tx->ureg, src, tx_cond(tx));
+ureg_BRK(tx->ureg);
+tx_endcond(tx);
+ureg_ENDIF(tx->ureg);
+return D3D_OK;
 }
 
 DECL_SPECIAL(PHASE)
@@ -3323,8 +3355,6 @@ sm1_parse_instruction(struct shader_translator *tx)
 insn->ndst = info->ndst;
 insn->nsrc = info->nsrc;
 
-assert(!insn->predicated && "TODO: predicated instructions");
-
 /* check version */
 {
 unsigned min = IS_VS ? info->vert_version.min : info->frag_version.min;
@@ -3353,12 +3383,30 @@ sm1_parse_instruction(struct shader_translator *tx)
 sm1_dump_instruction(insn, tx->cond_depth + tx->loop_depth);
 sm1_instruction_check(insn);
 
+if (insn->predicated) {
+tx->predicated_activated = true;
+if (ureg_dst_is_undef(tx->regs.predicate_tmp)) {
+tx->regs.predicate_tmp = ureg_DECL_temporary(tx->ureg);
+tx->regs.predicate_dst = ureg_DECL_temporary(tx->ure

[Mesa-dev] [PATCH 01/22] st/nine: Add ff key hash to help debug

2018-09-23 Thread Axel Davy
This is very useful to find in the log
the ff shader shource of a given call.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index 58cc29b5e30..e5b0c3e1258 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1683,6 +1683,7 @@ nine_ff_get_vs(struct NineDevice9 *device)
 key.tc_dim_output |= dim << (s * 3);
 }
 
+DBG("VS ff key hash: %x\n", nine_ff_vs_key_hash());
 vs = util_hash_table_get(device->ff.ht_vs, );
 if (vs)
 return vs;
@@ -1836,6 +1837,7 @@ nine_ff_get_ps(struct NineDevice9 *device)
 !(projection_matrix->_34 == 0.0f &&
   projection_matrix->_44 == 1.0f);
 
+DBG("PS ff key hash: %x\n", nine_ff_ps_key_hash());
 ps = util_hash_table_get(device->ff.ht_ps, );
 if (ps)
 return ps;
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/22] st/nine: Remove clamping when mul_zero_wins

2018-09-23 Thread Axel Davy
Tests show the clamping can be removed
when mul_zero_wins is supported.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_shader.c | 55 ---
 1 file changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 9e90da59597..5c33a6308c2 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -460,6 +460,7 @@ struct shader_translator
 boolean shift_wpos;
 boolean wpos_is_sysval;
 boolean face_is_sysval_integer;
+boolean mul_zero_wins;
 unsigned texcoord_sn;
 
 struct sm1_instruction insn; /* current instruction */
@@ -2293,15 +2294,46 @@ DECL_SPECIAL(POW)
 return D3D_OK;
 }
 
+/* Tests results on Win 10:
+ * NV (NVIDIA GeForce GT 635M)
+ * AMD (AMD Radeon HD 7730M)
+ * INTEL (Intel(R) HD Graphics 4000)
+ * PS2 and PS3:
+ * RCP and RSQ can generate inf on NV and AMD.
+ * RCP and RSQ are clamped on INTEL (+- FLT_MAX),
+ * NV: log not clamped
+ * AMD: log(0) is -FLT_MAX (but log(inf) is inf)
+ * INTEL: log(0) is -FLT_MAX and log(inf) is 127
+ * All devices have 0*anything = 0
+ *
+ * INTEL VS2 and VS3: same behaviour.
+ * Some differences VS2 and VS3 for constants defined with inf/NaN.
+ * While PS3, VS3 and PS2 keep NaN and Inf shader constants without change,
+ * VS2 seems to clamp to zero (may be test failure).
+ * AMD VS2: unknown, VS3: very likely behaviour of PS3
+ * NV VS2 and VS3: very likely behaviour of PS3
+ * For both, Inf in VS becomes NaN is PS
+ * "Very likely" because the test was less extensive.
+ *
+ * Thus all clamping can be removed for shaders 2 and 3,
+ * as long as 0*anything = 0.
+ * Else clamps to enforce 0*anything = 0 (anything being then
+ * neither inf or NaN, the user being unlikely to pass them
+ * as constant).
+ * The status for VS1 and PS1 is unknown.
+ */
+
 DECL_SPECIAL(RCP)
 {
 struct ureg_program *ureg = tx->ureg;
 struct ureg_dst dst = tx_dst_param(tx, >insn.dst[0]);
 struct ureg_src src = tx_src_param(tx, >insn.src[0]);
-struct ureg_dst tmp = tx_scratch(tx);
+struct ureg_dst tmp = tx->mul_zero_wins ? dst : tx_scratch(tx);
 ureg_RCP(ureg, tmp, src);
-ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
-ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), ureg_src(tmp));
+if (!tx->mul_zero_wins) {
+ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
+ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), ureg_src(tmp));
+}
 return D3D_OK;
 }
 
@@ -2310,9 +2342,10 @@ DECL_SPECIAL(RSQ)
 struct ureg_program *ureg = tx->ureg;
 struct ureg_dst dst = tx_dst_param(tx, >insn.dst[0]);
 struct ureg_src src = tx_src_param(tx, >insn.src[0]);
-struct ureg_dst tmp = tx_scratch(tx);
+struct ureg_dst tmp = tx->mul_zero_wins ? dst : tx_scratch(tx);
 ureg_RSQ(ureg, tmp, ureg_abs(src));
-ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
+if (!tx->mul_zero_wins)
+ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
 return D3D_OK;
 }
 
@@ -2323,7 +2356,11 @@ DECL_SPECIAL(LOG)
 struct ureg_dst dst = tx_dst_param(tx, >insn.dst[0]);
 struct ureg_src src = tx_src_param(tx, >insn.src[0]);
 ureg_LG2(ureg, tmp, ureg_abs(src));
-ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), tx_src_scalar(tmp));
+if (tx->mul_zero_wins) {
+ureg_MOV(ureg, dst, tx_src_scalar(tmp));
+} else {
+ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), tx_src_scalar(tmp));
+}
 return D3D_OK;
 }
 
@@ -2353,7 +2390,8 @@ DECL_SPECIAL(NRM)
 struct ureg_src src = tx_src_param(tx, >insn.src[0]);
 ureg_DP3(ureg, tmp, src, src);
 ureg_RSQ(ureg, tmp, nrm);
-ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
+if (!tx->mul_zero_wins)
+ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
 ureg_MUL(ureg, dst, src, nrm);
 return D3D_OK;
 }
@@ -3637,7 +3675,8 @@ nine_translate_shader(struct NineDevice9 *device, struct 
nine_shader_info *info,
 ureg_property(tx->ureg, TGSI_PROPERTY_FS_COORD_PIXEL_CENTER, 
TGSI_FS_COORD_PIXEL_CENTER_INTEGER);
 }
 
-if (GET_CAP(TGSI_MUL_ZERO_WINS))
+tx->mul_zero_wins = GET_CAP(TGSI_MUL_ZERO_WINS);
+if (tx->mul_zero_wins)
ureg_property(tx->ureg, TGSI_PROPERTY_MUL_ZERO_WINS, 1);
 
 while (!sm1_parse_eof(tx) && !tx->failure)
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/22] st/nine: Fix aliased read in ff

2018-09-23 Thread Axel Davy
Fix aliasing of colorarg_b4 with
colorarg_b5.

Fixes: https://github.com/iXit/Mesa-3D/issues/302

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index 2cb5f080be8..453f280c9fc 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1463,9 +1463,9 @@ nine_ff_build_ps(struct NineDevice9 *device, struct 
nine_ff_ps_key *key)
 ureg_MUL(ureg, ps.rMod, ps.rCurSrc, ps.rTexSrc);
 }
 
-colorarg[0] = (key->ts[s].colorarg0 | ((key->colorarg_b4[0] >> s) << 
4) | ((key->colorarg_b5[0] >> s) << 5)) & 0x3f;
-colorarg[1] = (key->ts[s].colorarg1 | ((key->colorarg_b4[1] >> s) << 
4) | ((key->colorarg_b5[1] >> s) << 5)) & 0x3f;
-colorarg[2] = (key->ts[s].colorarg2 | ((key->colorarg_b4[2] >> s) << 
4) | ((key->colorarg_b5[2] >> s) << 5)) & 0x3f;
+colorarg[0] = (key->ts[s].colorarg0 | (((key->colorarg_b4[0] >> s) & 
0x1) << 4) | ((key->colorarg_b5[0] >> s) << 5)) & 0x3f;
+colorarg[1] = (key->ts[s].colorarg1 | (((key->colorarg_b4[1] >> s) & 
0x1) << 4) | ((key->colorarg_b5[1] >> s) << 5)) & 0x3f;
+colorarg[2] = (key->ts[s].colorarg2 | (((key->colorarg_b4[2] >> s) & 
0x1) << 4) | ((key->colorarg_b5[2] >> s) << 5)) & 0x3f;
 alphaarg[0] = (key->ts[s].alphaarg0 | ((key->alphaarg_b4[0] >> s) << 
4)) & 0x1f;
 alphaarg[1] = (key->ts[s].alphaarg1 | ((key->alphaarg_b4[1] >> s) << 
4)) & 0x1f;
 alphaarg[2] = (key->ts[s].alphaarg2 | ((key->alphaarg_b4[2] >> s) << 
4)) & 0x1f;
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/22] st/nine: Clarify some ff assignments

2018-09-23 Thread Axel Davy
colorarg0, etc are 3 bits wide.
Make the code more readable by adding an & 0x7
to further indicate we only remember the first 3 bits only.

The 4th bit is always 0,
and colorarg_b4, colorarg_b5, etc are used to store
the 5th and 6th bits.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index e5b0c3e1258..c64085b2ac4 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1765,9 +1765,9 @@ nine_ff_get_ps(struct NineDevice9 *device)
 sampler_mask |= (1 << s);
 
 if (key.ts[s].colorop != D3DTOP_DISABLE) {
-if (used_c & 0x1) key.ts[s].colorarg0 = 
context->ff.tex_stage[s][D3DTSS_COLORARG0];
-if (used_c & 0x2) key.ts[s].colorarg1 = 
context->ff.tex_stage[s][D3DTSS_COLORARG1];
-if (used_c & 0x4) key.ts[s].colorarg2 = 
context->ff.tex_stage[s][D3DTSS_COLORARG2];
+if (used_c & 0x1) key.ts[s].colorarg0 = 
context->ff.tex_stage[s][D3DTSS_COLORARG0] & 0x7;
+if (used_c & 0x2) key.ts[s].colorarg1 = 
context->ff.tex_stage[s][D3DTSS_COLORARG1] & 0x7;
+if (used_c & 0x4) key.ts[s].colorarg2 = 
context->ff.tex_stage[s][D3DTSS_COLORARG2] & 0x7;
 if (used_c & 0x1) key.colorarg_b4[0] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG0] >> 4) << s;
 if (used_c & 0x1) key.colorarg_b5[0] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG0] >> 5) << s;
 if (used_c & 0x2) key.colorarg_b4[1] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG1] >> 4) << s;
@@ -1776,9 +1776,9 @@ nine_ff_get_ps(struct NineDevice9 *device)
 if (used_c & 0x4) key.colorarg_b5[2] |= 
(context->ff.tex_stage[s][D3DTSS_COLORARG2] >> 5) << s;
 }
 if (key.ts[s].alphaop != D3DTOP_DISABLE) {
-if (used_a & 0x1) key.ts[s].alphaarg0 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG0];
-if (used_a & 0x2) key.ts[s].alphaarg1 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG1];
-if (used_a & 0x4) key.ts[s].alphaarg2 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG2];
+if (used_a & 0x1) key.ts[s].alphaarg0 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG0] & 0x7;
+if (used_a & 0x2) key.ts[s].alphaarg1 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG1] & 0x7;
+if (used_a & 0x4) key.ts[s].alphaarg2 = 
context->ff.tex_stage[s][D3DTSS_ALPHAARG2] & 0x7;
 if (used_a & 0x1) key.alphaarg_b4[0] |= 
(context->ff.tex_stage[s][D3DTSS_ALPHAARG0] >> 4) << s;
 if (used_a & 0x2) key.alphaarg_b4[1] |= 
(context->ff.tex_stage[s][D3DTSS_ALPHAARG1] >> 4) << s;
 if (used_a & 0x4) key.alphaarg_b4[2] |= 
(context->ff.tex_stage[s][D3DTSS_ALPHAARG2] >> 4) << s;
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0

2018-09-12 Thread Axel Davy

On 9/12/18 11:44 PM, Roland Scheidegger wrote:

Am 12.09.2018 um 23:43 schrieb Roland Scheidegger:


I small precision I want to add: This is not the only place clamping
makes a difference.

Indeed else MUL_ZERO_WINS would be safe to use and remove all the clamping.


The rasterizers can produce NaN when given Inf in the vertex shader on
some devices for example,

and I think on some devices inf and FLT_MAX give different color in the
pixel shader.


Thus why I want to test carefully what do the other vendors for all the
shader versions (we know already there are

behaviour changes for some) and check with the behaviours mentionned
above, before removing the clamps when MUL_ZERO_WINS.

Yes it's all quite a mess. d3d9 rules are awkward (if they are even
documented), whereas gl may do whatever (personally I would consider at
least for core contexts everything not following ieee754 rules a bug,
well maybe not for gles...).

Forgot to mention, of course unless forced with things like mul_zero_wins.

Roland



Yes, it is indeed quite messy. The online official documentation says a 
thing,


and the drivers do differently. However all windows driver do mul_zero_wins.


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0

2018-09-12 Thread Axel Davy

On 9/9/18 9:40 PM, Ilia Mirkin wrote:

On Sun, Sep 9, 2018 at 3:19 PM, Axel Davy  wrote:

Tests showed Intel on windows does always clamp
RCP, RSQ and LOG (thus preventing inf/nan generation),
for all shader versions (some vendor behaviours vary
with shader versions).

By the way, this happens because on Intel, the ALU is put into a
special mode where it just doesn't generate NaN's at all under any
conditions. I don't think that other vendors operate this way.



I've found the code source of my tests, and completed them to have a 
better picture.


The conclusion is that the clamping can be safely removed all the time 
when mul_zero_wins.


I produced a commit in that purpose:

https://github.com/iXit/Mesa-3D/commit/ed82c87da2799a40d8f7b87f8ff99d6f20a9f601


I'll send it to mesa-dev when our testers can check there is no regression.

In the commit I detail the test results.


I still want to get "st/nine: Clamp RCP when 0*inf!=0" merged, 
especially for stable.



Yours,


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0

2018-09-12 Thread Axel Davy

On 9/12/18 8:17 AM, Axel Davy wrote:


The goal is to catch inf and -inf and replace them by FLT_MAX and 
-FLT_MAX.


Without, the NaN would appear when doing mul or mad.

Axel



I small precision I want to add: This is not the only place clamping 
makes a difference.


Indeed else MUL_ZERO_WINS would be safe to use and remove all the clamping.


The rasterizers can produce NaN when given Inf in the vertex shader on 
some devices for example,


and I think on some devices inf and FLT_MAX give different color in the 
pixel shader.



Thus why I want to test carefully what do the other vendors for all the 
shader versions (we know already there are


behaviour changes for some) and check with the behaviours mentionned 
above, before removing the clamps when MUL_ZERO_WINS.




Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0

2018-09-12 Thread Axel Davy

On 9/11/18 11:28 PM, Roland Scheidegger wrote:

Am 09.09.2018 um 21:19 schrieb Axel Davy:

Tests done on several devices of all 3 vendors and
of different generations showed that there are several
ways of handling infs and NaN for d3d9.

Tests showed Intel on windows does always clamp
RCP, RSQ and LOG (thus preventing inf/nan generation),
for all shader versions (some vendor behaviours vary
with shader versions).
Doing this in nine avoids 0*inf issues for drivers
that can't generate 0*inf=0 (which is controled by
TGSI's MUL_ZERO_WINS).

For now clamp for all drivers. An ulterior optimization
would be to avoid clamping for drivers with MUL_ZERO_WINS
for the specific shader versions where NV or AMD don't
clamp.

LOG and RSQ being already clamped, this patch only
clamps RCP.

Fixes: 
https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FiXit%2FMesa-3D%2Fissues%2F316data=02%7C01%7Csroland%40vmware.com%7Cdccfde1e101a477ee00808d6168941d4%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636721176130476488sdata=JbGHhpPJPgUcw4i%2FSYN%2B30a7okSb5sT8bR%2B4PKvCnyM%3Dreserved=0

Signed-off-by: Axel Davy 
CC: 
---
  src/gallium/state_trackers/nine/nine_shader.c | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 7db07d8f69..5b8ad3f161 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -2273,6 +2273,18 @@ DECL_SPECIAL(POW)
  return D3D_OK;
  }
  
+DECL_SPECIAL(RCP)

+{
+struct ureg_program *ureg = tx->ureg;
+struct ureg_dst dst = tx_dst_param(tx, >insn.dst[0]);
+struct ureg_src src = tx_src_param(tx, >insn.src[0]);
+struct ureg_dst tmp = tx_scratch(tx);
+ureg_RCP(ureg, tmp, src);
+ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
+ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), ureg_src(tmp));

I'm not sure what the ureg_MAX is supposed to do?
The min already gets rid of all NaNs (iff the driver follows the
d3d10-mandated behavior of picking the non-nan number for min/max if one
of the values is a NaN - if not doing both min/max isn't going to help
neither...).

Roland


The goal is to catch inf and -inf and replace them by FLT_MAX and -FLT_MAX.

Without, the NaN would appear when doing mul or mad.

Axel






+return D3D_OK;
+}
+
  DECL_SPECIAL(RSQ)
  {
  struct ureg_program *ureg = tx->ureg;
@@ -2909,7 +2921,7 @@ static const struct sm1_op_info inst_table[] =
  _OPI(SUB, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(SUB)), /* 3 
*/
  _OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
  _OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
-_OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
+_OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RCP)), /* 6 */
  _OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 
*/
  _OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
  _OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0

2018-09-09 Thread Axel Davy

On 9/9/18 9:40 PM, Ilia Mirkin wrote:

On Sun, Sep 9, 2018 at 3:19 PM, Axel Davy  wrote:

Tests showed Intel on windows does always clamp
RCP, RSQ and LOG (thus preventing inf/nan generation),
for all shader versions (some vendor behaviours vary
with shader versions).

By the way, this happens because on Intel, the ALU is put into a
special mode where it just doesn't generate NaN's at all under any
conditions. I don't think that other vendors operate this way.


Yes exactly, though the documentation, if I remember correctly,

says the flag is for shaders version <= 2.0, but tests showed the clamping

was also used for version 3.0.


The point is that games work properly on all vendors. Thus picking intel 
behaviour is safe,


even if we use a card from another vendor.


I think the inf/nan behaviour on AMD and NVidia was with compute 
applications in mind (there was no


other way to do compute back then), but games weren't interested in Infs 
and NaN.



Also our tests showed that all vendors have 0*inf = 0 and 0*NaN = 0, 
even if rcp, log and rsq are clamped.


(inf and NaN can be passed via constants or inputs for such tests).


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/nine: Clamp RCP when 0*inf!=0

2018-09-09 Thread Axel Davy


On 9/9/18 9:35 PM, Ilia Mirkin wrote:

On Sun, Sep 9, 2018 at 3:19 PM, Axel Davy  wrote:

For now clamp for all drivers. An ulterior optimization
would be to avoid clamping for drivers with MUL_ZERO_WINS
for the specific shader versions where NV or AMD don't
clamp.

Too bad. The whole point of this feature was for nine to use it.
Should we just drop that logic?

   -ilia



I have lost my notes on the detailed results of my tests on the 3 
vendors (and the test itself).

But I plan on finding them back and complete them.
Then I want to send a patch to remove any clamping when possible.

I remember there was some weird behaviours depending of whether it is vs 
or ps and shader version,

but that mostly we should be able to get rid of the clamping in most cases.

Axel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] st/nine: Add new helper for object creation with bind

2018-09-09 Thread Axel Davy
Add a new helper to create objects starting with a bind
count instead of a ref count.

Signed-off-by: Axel Davy 
---
 .../state_trackers/nine/nine_helpers.h| 26 +++
 1 file changed, 26 insertions(+)

diff --git a/src/gallium/state_trackers/nine/nine_helpers.h 
b/src/gallium/state_trackers/nine/nine_helpers.h
index a0c55bd9ee..c14dd1c04f 100644
--- a/src/gallium/state_trackers/nine/nine_helpers.h
+++ b/src/gallium/state_trackers/nine/nine_helpers.h
@@ -99,6 +99,32 @@ static inline void _nine_bind(void **dst, void *obj)
 } \
 return D3D_OK
 
+#define NINE_DEVICE_CHILD_BIND_NEW(nine, out, dev, ...) \
+{ \
+struct NineUnknownParams __params; \
+struct Nine##nine *__data; \
+ \
+__data = CALLOC_STRUCT(Nine##nine); \
+if (!__data) { return E_OUTOFMEMORY; } \
+ \
+__params.vtable = ((dev)->params.BehaviorFlags & 
D3DCREATE_MULTITHREADED) ? ##nine##_vtable : ##nine##_vtable; \
+__params.guids = Nine##nine##_IIDs; \
+__params.dtor = (void *)Nine##nine##_dtor; \
+__params.container = NULL; \
+__params.device = dev; \
+__params.start_with_bind_not_ref = true; \
+{ \
+HRESULT __hr = Nine##nine##_ctor(__data, &__params, ## 
__VA_ARGS__); \
+if (FAILED(__hr)) { \
+Nine##nine##_dtor(__data); \
+return __hr; \
+} \
+} \
+ \
+*(out) = __data; \
+} \
+return D3D_OK
+
 #define NINE_NEW(nine, out, lock, ...) \
 { \
 struct NineUnknownParams __params; \
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] st/nine: Avoid RefToBind calls in ff

2018-09-09 Thread Axel Davy
When using csmt, ff shader creation happens on the csmt
thread. Creating the shaders, then calling RefToBind causes
the device ref to be increased then decreased.

However the device dtor assumes than no work pending on the
csmt thread could increase the device ref, leading to hang.

The issue is avoided by creating the shaders with a bind
count directly.

Fixes: https://github.com/iXit/Mesa-3D/issues/295

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c   | 2 --
 src/gallium/state_trackers/nine/pixelshader9.c  | 6 +-
 src/gallium/state_trackers/nine/vertexshader9.c | 6 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index fabc1d3b88..58cc29b5e3 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -1698,7 +1698,6 @@ nine_ff_get_vs(struct NineDevice9 *device)
 (void)err;
 assert(err == PIPE_OK);
 device->ff.num_vs++;
-NineUnknown_ConvertRefToBind(NineUnknown(vs));
 
 vs->num_inputs = bld.num_inputs;
 for (n = 0; n < bld.num_inputs; ++n)
@@ -1850,7 +1849,6 @@ nine_ff_get_ps(struct NineDevice9 *device)
 (void)err;
 assert(err == PIPE_OK);
 device->ff.num_ps++;
-NineUnknown_ConvertRefToBind(NineUnknown(ps));
 
 ps->rt_mask = 0x1;
 ps->sampler_mask = sampler_mask;
diff --git a/src/gallium/state_trackers/nine/pixelshader9.c 
b/src/gallium/state_trackers/nine/pixelshader9.c
index bfc395cdf5..6f053f709b 100644
--- a/src/gallium/state_trackers/nine/pixelshader9.c
+++ b/src/gallium/state_trackers/nine/pixelshader9.c
@@ -203,5 +203,9 @@ NinePixelShader9_new( struct NineDevice9 *pDevice,
   struct NinePixelShader9 **ppOut,
   const DWORD *pFunction, void *cso )
 {
-NINE_DEVICE_CHILD_NEW(PixelShader9, ppOut, pDevice, pFunction, cso);
+if (cso) { /* ff shader. Needs to start with bind count */
+NINE_DEVICE_CHILD_BIND_NEW(PixelShader9, ppOut, pDevice, pFunction, 
cso);
+} else {
+NINE_DEVICE_CHILD_NEW(PixelShader9, ppOut, pDevice, pFunction, cso);
+}
 }
diff --git a/src/gallium/state_trackers/nine/vertexshader9.c 
b/src/gallium/state_trackers/nine/vertexshader9.c
index a4228af157..f104a9ad13 100644
--- a/src/gallium/state_trackers/nine/vertexshader9.c
+++ b/src/gallium/state_trackers/nine/vertexshader9.c
@@ -262,5 +262,9 @@ NineVertexShader9_new( struct NineDevice9 *pDevice,
struct NineVertexShader9 **ppOut,
const DWORD *pFunction, void *cso )
 {
-NINE_DEVICE_CHILD_NEW(VertexShader9, ppOut, pDevice, pFunction, cso);
+if (cso) {
+NINE_DEVICE_CHILD_BIND_NEW(VertexShader9, ppOut, pDevice, pFunction, 
cso);
+} else {
+NINE_DEVICE_CHILD_NEW(VertexShader9, ppOut, pDevice, pFunction, cso);
+}
 }
-- 
2.18.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   3   4   5   6   7   8   9   10   >