Re: [Mesa-dev] Fwd: [PATCH 5/6] st/mesa: implement blit-based ReadPixels

2013-03-15 Thread Michel Dänzer
On Thu, 2013-03-14 at 23:35 +0100, Martin Andersson wrote:
 On Thu, Mar 14, 2013 at 7:45 PM, Marek Olšák mar...@gmail.com wrote:
 
  +   /* See if the texture format already matches the format and type,
  +    * in which case the memcpy-based fast path will likely be used and
  +    * we don't have to blit. */
  +   if (_mesa_format_matches_format_and_type(rb->Format, format,
  +                                            type, pack->SwapBytes)) {
  +      goto fallback;
  +   }
 
 On my system (Intel core i7 and AMD 6950) the memcpy fast-path takes
 around 210 milliseconds and the blit path takes around 9 milliseconds, for a
 1920x1200 image. So it is much faster, at least on my system, to use the
 blit path even when the mesa format matches format and type.
 
 To test this I forced the use of the memcpy fast-path (the mesa format was
 MESA_FORMAT_XRGB, the format was GL_BGRA and the type was
 GL_UNSIGNED_BYTE). I visually inspected the resulting image and I could
 not see anything wrong with it.
 
 But perhaps forcing the use of the memcpy fast-path invalidates the results.
 Or there might be some other issues that I am missing, but I just wanted to
 say that on my system it is better to remove this check.

If the read buffer is in VRAM, reading from it with the CPU directly
will be very slow.
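
To put the timings quoted above in perspective, here is a small sketch that converts them into effective throughput. It assumes 4 bytes per pixel (GL_BGRA with GL_UNSIGNED_BYTE) and a tightly packed image; the function names are made up for illustration and are not Mesa API.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a tightly packed image (no row padding assumed). */
static size_t image_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Effective throughput in decimal megabytes per second. */
static double throughput_mb_per_s(size_t bytes, double milliseconds)
{
    assert(milliseconds > 0.0);
    return ((double)bytes / 1e6) / (milliseconds / 1e3);
}
```

For a 1920x1200 image (9,216,000 bytes), 210 ms works out to roughly 44 MB/s and 9 ms to roughly 1024 MB/s. A CPU read rate that low is consistent with reading directly from VRAM.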


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 6954] Compile fails on Solaris 10 x86

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=6954

--- Comment #19 from chemtech patsev.an...@gmail.com ---
Corey,
Do you still experience this issue with newer software?
Please check the status of your issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 4965] Windows GDI driver doublebuffer + alpha triggers assertation

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=4965

--- Comment #3 from chemtech patsev.an...@gmail.com ---
Gregor Anich,
Do you still experience this issue with newer software?
Please check the status of your issue.



[Mesa-dev] [Bug 62362] New: Crash when using Wayland EGL platform

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62362

          Priority: medium
            Bug ID: 62362
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: Crash when using Wayland EGL platform
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: geoma...@gmail.com
          Hardware: All
            Status: NEW
           Version: git
         Component: Mesa core
           Product: Mesa

Crash in Mesa when using Wayland EGL platform:

The root cause is an incorrect parameter passed to wayland_roundtrip() in the
function wayland_shm_display_init_screen (see file
src/gallium/state_trackers/egl/wayland/native_shm.c).

The following patch fixes the issue:

diff --git a/src/gallium/state_trackers/egl/wayland/native_shm.c b/src/gallium/state_trackers/egl/wayland/native_shm.c
index a959237..e543619 100644
--- a/src/gallium/state_trackers/egl/wayland/native_shm.c
+++ b/src/gallium/state_trackers/egl/wayland/native_shm.c
@@ -163,7 +163,8 @@ wayland_shm_display_init_screen(struct native_display *ndpy)
       return FALSE;
 
    if (shmdpy->base.formats == 0)
-      wayland_roundtrip(shmdpy->base.dpy);
+      wayland_roundtrip(&shmdpy->base);
+
    if (shmdpy->base.formats == 0)
       return FALSE;
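
The kind of mistake the patch above addresses can be illustrated with a self-contained sketch. It assumes that wayland_roundtrip() expects a pointer to the wrapper display struct rather than to the raw connection it holds; all type and function names below are hypothetical stand-ins, not the real Mesa or Wayland definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real Mesa or Wayland definitions. */
struct raw_connection { int fd; };

struct display_wrapper {
    struct raw_connection *dpy;  /* underlying connection */
    unsigned formats;            /* filled in by a roundtrip */
};

struct shm_display {
    struct display_wrapper base; /* embedded by value, as in the patch */
};

/* Takes the wrapper, because it updates wrapper state (formats). */
static int roundtrip(struct display_wrapper *display)
{
    assert(display != NULL);
    if (display->dpy == NULL)
        return -1;
    display->formats |= 1u;      /* pretend the server advertised a format */
    return 0;
}

static int init_screen(struct shm_display *shmdpy)
{
    if (shmdpy->base.formats == 0)
        roundtrip(&shmdpy->base);    /* correct: pointer to the wrapper */
    /* Passing shmdpy->base.dpy instead would hand over the connection
     * object, which the callee would then misread as a wrapper. */
    return shmdpy->base.formats != 0;
}
```

With the wrapper passed correctly, the second check of formats sees the value written by the roundtrip and the function succeeds.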



Re: [Mesa-dev] [PATCH] R600: Relax some vector constraints on Dot4.

2013-03-15 Thread Christian König

Hi Vincent,

while I really appreciate your work, I think your development is going 
in the wrong direction here. Those copies you're trying to avoid (not 
only with this patch, but also with the previous REG_SEQUENCE patches) 
shouldn't happen in the first place. I'm not deep enough into the R600 
part of our LLVM backend to be 100% sure, but to me 
this just looks like workarounds for an incorrectly defined register space.


Here is a simple example from SI that should show how things are 
intended to work. It's a simple 2D texture fetch; the coordinates for 
this fetch are usually provided in a two-element vector built from 
VGPRs (I use a 2D fetch just for simplicity; a 3D fetch with explicit 
LOD would work the same way and would use a four-element vector).


After ISel the assembler code starts with something like this (simplified):
...
%vreg13<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg17<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg22<def> = IMPLICIT_DEF; VReg_64:%vreg22
%vreg21<def,tied1> = INSERT_SUBREG %vreg22<tied0>, %vreg13<kill>, sub0; VReg_64:%vreg21,%vreg22 VReg_32:%vreg13
%vreg23<def,tied1> = INSERT_SUBREG %vreg21<tied0>, %vreg17<kill>, sub1; VReg_64:%vreg23,%vreg21 VReg_32:%vreg17

%vreg24<def> = IMAGE_SAMPLE 15, 0, 0, 0, 0, 0, 0, 0, %vreg23<kill>, 

As you can see, the subcomponents of the vectors are inserted/extracted 
just like on R600, but the register allocator is capable of 
handling that much better than on R600, thereby avoiding the (sometimes 
quite expensive) COPY operations in the first place. The resulting code 
looks like this:


...
%vreg23:sub0<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg23:sub1<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg24<def> = IMAGE_SAMPLE 15, 0, 0, 0, 0, 0, 0, 0, %vreg23, ...

So INSERT_SUBREG isn't replaced with a COPY like on R600, but instead 
the V_INTERP_P2_F32 instructions can write directly to the appropriate 
sub register component.


I'm not 100% sure why this doesn't work the same way on R600, but I 
think it would be a good idea to figure that out.


Cheers,
Christian.

On 14.03.2013 21:51, Vincent Lejeune wrote:

Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the register
coalescer to remove some unneeded COPYs.
This patch also defines some structures/functions that can be used to handle
every vector instruction (CUBE, Cayman special instructions...) in a similar
fashion.
---
  lib/Target/R600/AMDGPUISelLowering.h|  1 +
  lib/Target/R600/R600Defines.h   | 74 
  lib/Target/R600/R600ExpandSpecialInstrs.cpp | 25 
  lib/Target/R600/R600ISelLowering.cpp| 21 +++
  lib/Target/R600/R600InstrInfo.cpp   | 88 +
  lib/Target/R600/R600InstrInfo.h |  5 ++
  lib/Target/R600/R600Instructions.td | 51 -
  lib/Target/R600/R600MachineScheduler.cpp|  2 +
  8 files changed, 266 insertions(+), 1 deletion(-)

diff --git a/lib/Target/R600/AMDGPUISelLowering.h 
b/lib/Target/R600/AMDGPUISelLowering.h
index f31b646..f9f5a60 100644
--- a/lib/Target/R600/AMDGPUISelLowering.h
+++ b/lib/Target/R600/AMDGPUISelLowering.h
@@ -125,6 +125,7 @@ enum {
SMIN,
UMIN,
URECIP,
+  DOT4,
EXPORT,
CONST_ADDRESS,
REGISTER_LOAD,
diff --git a/lib/Target/R600/R600Defines.h b/lib/Target/R600/R600Defines.h
index 16cfcf5..72d83b0 100644
--- a/lib/Target/R600/R600Defines.h
+++ b/lib/Target/R600/R600Defines.h
@@ -92,6 +92,80 @@ namespace R600Operands {
  {0,-1,-1,-1,-1, 1, 2, 3, 4, 5,-1, 6, 7, 8, 9,-1,10,11,12,13,14,15,16,17}
};
  
+  enum VecOps {

+UPDATE_EXEC_MASK_X,
+UPDATE_PREDICATE_X,
+WRITE_X,
+OMOD_X,
+DST_REL_X,
+CLAMP_X,
+SRC0_X,
+SRC0_NEG_X,
+SRC0_REL_X,
+SRC0_ABS_X,
+SRC0_SEL_X,
+SRC1_X,
+SRC1_NEG_X,
+SRC1_REL_X,
+SRC1_ABS_X,
+SRC1_SEL_X,
+PRED_SEL_X,
+UPDATE_EXEC_MASK_Y,
+UPDATE_PREDICATE_Y,
+WRITE_Y,
+OMOD_Y,
+DST_REL_Y,
+CLAMP_Y,
+SRC0_Y,
+SRC0_NEG_Y,
+SRC0_REL_Y,
+SRC0_ABS_Y,
+SRC0_SEL_Y,
+SRC1_Y,
+SRC1_NEG_Y,
+SRC1_REL_Y,
+SRC1_ABS_Y,
+SRC1_SEL_Y,
+PRED_SEL_Y,
+UPDATE_EXEC_MASK_Z,
+UPDATE_PREDICATE_Z,
+WRITE_Z,
+OMOD_Z,
+DST_REL_Z,
+CLAMP_Z,
+SRC0_Z,
+SRC0_NEG_Z,
+SRC0_REL_Z,
+SRC0_ABS_Z,
+SRC0_SEL_Z,
+SRC1_Z,
+SRC1_NEG_Z,
+SRC1_REL_Z,
+SRC1_ABS_Z,
+SRC1_SEL_Z,
+PRED_SEL_Z,
+UPDATE_EXEC_MASK_W,
+UPDATE_PREDICATE_W,
+WRITE_W,
+OMOD_W,
+DST_REL_W,
+CLAMP_W,
+SRC0_W,
+SRC0_NEG_W,
+SRC0_REL_W,
+SRC0_ABS_W,
+SRC0_SEL_W,
+SRC1_W,
+SRC1_NEG_W,
+SRC1_REL_W,
+SRC1_ABS_W,
+SRC1_SEL_W,
+PRED_SEL_W,
+IMM_0,
+IMM_1,
+VEC_COUNT
+ };
+
  }
  
  #endif // R600DEFINES_H_

diff --git a/lib/Target/R600/R600ExpandSpecialInstrs.cpp 
b/lib/Target/R600/R600ExpandSpecialInstrs.cpp
index f8c900f..993bdad 100644
--- 
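
The VecOps enum in the quoted patch lists the same 17 per-channel operand fields four times (X, Y, Z, W), followed by IMM_0, IMM_1 and VEC_COUNT. Assuming that layout is intentional, an operand index can be computed from a channel and a field rather than spelled out. The helper below is only an illustration in plain C; its names are not part of the patch.

```c
#include <assert.h>

enum { FIELDS_PER_CHANNEL = 17 };   /* UPDATE_EXEC_MASK .. PRED_SEL */

enum channel { CHAN_X, CHAN_Y, CHAN_Z, CHAN_W, CHAN_COUNT };

/* Per-channel field order, copied from the enum in the patch. */
enum field {
    F_UPDATE_EXEC_MASK, F_UPDATE_PREDICATE, F_WRITE, F_OMOD, F_DST_REL,
    F_CLAMP, F_SRC0, F_SRC0_NEG, F_SRC0_REL, F_SRC0_ABS, F_SRC0_SEL,
    F_SRC1, F_SRC1_NEG, F_SRC1_REL, F_SRC1_ABS, F_SRC1_SEL, F_PRED_SEL
};

/* Index of a field within the flat operand list. */
static int vec_op_index(enum channel chan, enum field f)
{
    assert((int)chan < (int)CHAN_COUNT && (int)f < (int)FIELDS_PER_CHANNEL);
    return (int)chan * FIELDS_PER_CHANNEL + (int)f;
}

/* The two immediates follow the four channel groups. */
static int vec_imm_index(int n)
{
    assert(n == 0 || n == 1);
    return (int)CHAN_COUNT * FIELDS_PER_CHANNEL + n;
}
```

Under this reading SRC0_X is index 6, SRC1_W is index 62, IMM_0 is 68 and VEC_COUNT is 70, which matches counting the enumerators by hand.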

Re: [Mesa-dev] [PATCH 2/4] radeonsi: switch to using resource destribtors for constants

2013-03-15 Thread Michel Dänzer
On Thu, 2013-03-14 at 16:49 +0100, Christian König wrote:
 diff --git a/src/gallium/drivers/radeonsi/si_state.c 
 b/src/gallium/drivers/radeonsi/si_state.c
 index a395ec4..219add5 100644
 --- a/src/gallium/drivers/radeonsi/si_state.c
 +++ b/src/gallium/drivers/radeonsi/si_state.c
 [...]
 + /* Fill in a T# buffer resource description */
 + si_pm4_sh_data_add(pm4, va & 0x);

The masking is superfluous.


 + si_pm4_sh_data_add(pm4, (S_008F04_BASE_ADDRESS_HI(va >> 32) |
 +  S_008F04_STRIDE(0)));
 + si_pm4_sh_data_add(pm4, 0x); //cb->buffer_size);

Why not use cb->buffer_size?
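
On the masking remark: assuming si_pm4_sh_data_add() takes a 32-bit dword (an assumption here, since its prototype is not shown in the quoted hunk), converting a 64-bit address to uint32_t already keeps only the low 32 bits, so an explicit mask changes nothing. A self-contained sketch with made-up helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Low dword of a 64-bit GPU virtual address. Conversion to uint32_t is
 * defined to reduce the value modulo 2^32, so no mask is needed. */
static uint32_t va_low_dword(uint64_t va)
{
    uint32_t low = (uint32_t)va;
    assert(low == (va & UINT64_C(0xFFFFFFFF)));  /* same with or without mask */
    return low;
}

/* Bits 32 and up of the address, before any field packing is applied. */
static uint32_t va_high_bits(uint64_t va)
{
    return (uint32_t)(va >> 32);
}
```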


The rest of the radeonsi patches in this series look good to me.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


Re: [Mesa-dev] glxgears is faster but 3D render is so slow

2013-03-15 Thread jupiter
Thanks Brian and Matt.

On 3/15/13, Matt Turner matts...@gmail.com wrote:
 On Thu, Mar 14, 2013 at 6:29 AM, Brian Paul bri...@vmware.com wrote:
 Hmm, I guess autoconf still has some unneeded dependencies on DRI when it's
 not needed.  You might try adding --with-gallium-drivers=swrast so that no
 DRI drivers are selected.

 Don't think so. He's just not setting --with-gallium-drivers= so he
 gets a radeon driver by default.

I did try setting --with-gallium-drivers=llvm, but that was an error: the
gallium drivers do not include one named llvm. I have now changed to
--with-gallium-drivers=swrast. Matt is right, I don't need libdrm
anymore.

What is the correct llvm value for --with-gallium-drivers? I also saw that
dri-core was built as well, which could cause the problems.

The package built on a CentOS 6.2 32-bit machine now includes
lib/gallium, but libGL.so and libGL.so.1 did not link to
lib/gallium/libGL.so.1.5.0. After manually linking lib/libGL.so
and libGL.so.1 to lib/gallium/libGL.so.1.5.0, the glxinfo
OpenGL renderer string now points to llvmpipe, but this seems to have
broken the xlib driver. My 3D application stopped running over the VNC
connection. Does llvmpipe use any DRI? Or is it still the xlib driver?
As VNC can only use xlib, anything that bypasses xlib will break the
VNC connection.

Thank you.

Kind regards.

Jupiter


[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

José Fonseca jfons...@vmware.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jfons...@vmware.com

--- Comment #5 from José Fonseca jfons...@vmware.com ---
(In reply to comment #4)
 FWIW this was using LLVM 3.2 as well.

It could be an LLVM-version-specific issue.  I haven't done much testing with
LLVM 3.2 (still using 3.1).  Could you retry with LLVM 3.1?



Re: [Mesa-dev] [PATCH 9/9] tgsi: add ArrayID documentation

2013-03-15 Thread Christian König

On 14.03.2013 15:53, Christoph Bumiller wrote:

On 14.03.2013 15:20, Christian König wrote:

From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
  src/gallium/docs/source/tgsi.rst |   16 
  1 file changed, 16 insertions(+)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index d9a7fe9..27fe039 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1833,6 +1833,22 @@ If Interpolate flag is set to 1, a Declaration 
Interpolate token follows.
  
  If file is TGSI_FILE_RESOURCE, a Declaration Resource token follows.
  
+If Array flag is set to 1, a Declaration Array token follows.

+
+Array Declaration
+
+
+Declarations can optional have an ArrayID attribute which can be referred by
+indirect addressing operands. An ArrayID of zero is reserved and treaded as
+if no ArrayID is specified.
+
+If an indirect addressing operand refers to an specific declaration by using

s/an/a


Thx, fixed.




+an ArrayID only the registers in this declaration are guaranteed to be
+accessed, accessing any register outside this declaration results in undefined
+behavior.

+ Note that the effective index is zero-based and not relative to the
specified declaration. XXX: Is it ? Should it be ?


Yes for compatibility reasons, otherwise we would need to change all 
drivers at once.





+
+If no ArrayID is specified with an indirect addressing operand the whole
+register file might be accessed by this operand.
  

+ A practice which is strongly discouraged. Don't do this if you have
more than 1 declaration for the file in question ! It will prevent
packing of scalar/vec2 arrays and effective memory alias analysis.


A bit shortened, but in general added the remark.


Packing ? Yes !
We can pack arrays if they're declared as e.g.
TEMP[0-3].xyzw
TEMP[4-31].x

And the caches will be very very thankful that we don't just access
every 4th element of our 4 times larger than it needs to be buffer !!!

And if your card can't do that, pleeease be nice and still make it
possible for other drivers. :o3


It is probably possible to do so with the new information, but it is not a 
priority for me because I primarily need it for our LLVM backend.
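
The packing argument can be made concrete with a little arithmetic. Assuming one 32-bit slot per declared component (an illustrative storage model, not something the patch specifies), the two declarations in the example need far less storage when the write mask is honoured. The function names are invented for this sketch.

```c
#include <assert.h>

/* Number of components enabled in a 4-bit xyzw write mask. */
static unsigned mask_components(unsigned writemask)
{
    unsigned n = 0;
    assert(writemask <= 0xFu);
    for (; writemask != 0; writemask >>= 1)
        n += writemask & 1u;
    return n;
}

/* 32-bit slots needed for registers first..last when only the masked
 * components are stored. */
static unsigned packed_slots(unsigned first, unsigned last, unsigned writemask)
{
    assert(last >= first);
    return (last - first + 1) * mask_components(writemask);
}

/* Slots needed when every register is stored as a full vec4. */
static unsigned unpacked_slots(unsigned first, unsigned last)
{
    assert(last >= first);
    return (last - first + 1) * 4;
}
```

TEMP[0-3].xyzw needs 16 slots and TEMP[4-31].x needs 28, so 44 in total, against 128 slots if all 32 registers are kept as full vectors. For the scalar array alone the unpacked form is exactly four times larger (112 against 28).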


Christian.


Re: [Mesa-dev] glxgears is faster but 3D render is so slow

2013-03-15 Thread Brian Paul

On 03/15/2013 05:39 AM, jupiter wrote:

Thanks Brian and Matt.

On 3/15/13, Matt Turner matts...@gmail.com wrote:

On Thu, Mar 14, 2013 at 6:29 AM, Brian Paul bri...@vmware.com wrote:

Hmm, I guess autoconf still has some unneeded dependencies on DRI when it's
not needed.  You might try adding --with-gallium-drivers=swrast so that no
DRI drivers are selected.


Don't think so. He's just not setting --with-gallium-drivers= so he
gets a radeon driver by default.


I did try setting --with-gallium-drivers=llvm, but that was an error: the
gallium drivers do not include one named llvm. I have now changed to
--with-gallium-drivers=swrast. Matt is right, I don't need libdrm
anymore.

What is the correct llvm value for --with-gallium-drivers? I also saw that
dri-core was built as well, which could cause the problems.

The package built on a CentOS 6.2 32-bit machine now includes
lib/gallium, but libGL.so and libGL.so.1 did not link to
lib/gallium/libGL.so.1.5.0. After manually linking lib/libGL.so
and libGL.so.1 to lib/gallium/libGL.so.1.5.0, the glxinfo
OpenGL renderer string now points to llvmpipe, but this seems to have
broken the xlib driver. My 3D application stopped running over the VNC
connection. Does llvmpipe use any DRI?


No.



Or is it still the xlib driver?


llvmpipe uses Xlib only.



As VNC can only use xlib, anything that bypasses xlib will break the
VNC connection.


Do other OpenGL apps run OK with llvmpipe or is it just Chimera that's 
not working?  How exactly is Chimera failing/broken?


-Brian


[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

--- Comment #6 from Brian Paul bri...@vmware.com ---
My piglit test was with LLVM 3.2.



Re: [Mesa-dev] [PATCH] wayland: fix segfault when using software rendering

2013-03-15 Thread Brian Paul

On 03/08/2013 01:32 PM, Philipp Brüschweiler wrote:

wayland_roundtrip() was given an incorrect parameter.
---
  src/gallium/state_trackers/egl/wayland/native_shm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/egl/wayland/native_shm.c b/src/gallium/state_trackers/egl/wayland/native_shm.c
index a959237..2499677 100644
--- a/src/gallium/state_trackers/egl/wayland/native_shm.c
+++ b/src/gallium/state_trackers/egl/wayland/native_shm.c
@@ -163,7 +163,7 @@ wayland_shm_display_init_screen(struct native_display *ndpy)
       return FALSE;
 
    if (shmdpy->base.formats == 0)
-      wayland_roundtrip(shmdpy->base.dpy);
+      wayland_roundtrip(&shmdpy->base);
    if (shmdpy->base.formats == 0)
       return FALSE;



Thanks.  Looks good to me and it matches the patch posted in 
https://bugs.freedesktop.org/show_bug.cgi?id=62362


I'll push this soon.

-Brian


[Mesa-dev] [Bug 62362] Crash when using Wayland EGL platform

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62362

Brian Paul bri...@vmware.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #1 from Brian Paul bri...@vmware.com ---
The same patch was posted earlier to the Mesa list.
Fixed with commit c07c18081e3b21070c7db3aea0c7a31a31ff20ce



[Mesa-dev] [Bug 2868] Mesa seems to be unable to handle 30bit TrueColor visuals

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=2868

--- Comment #4 from chemtech patsev.an...@gmail.com ---
Roland Mainz,
Do you still experience this issue with newer drivers?
Please check the status of your issue.



[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

--- Comment #7 from Roland Scheidegger srol...@vmware.com ---
After a quick look at the generated IR, it indeed seems broken.
However, it is broken the opposite way, that is the early exit path is ok but
if the path isn't taken it will pick zero as the output color.
So if you swap the test like so:
[test]
uniform vec4 v 1 0 1 0

Then it fails.

tgsi looks like this:
  0: SLT TEMP[0].x, CONST[0]., IMM[0].
  1: F2I TEMP[0].x, -TEMP[0]
  2: SLT TEMP[1].x, IMM[0]., CONST[0].
  3: F2I TEMP[1].x, -TEMP[1]
  4: OR TEMP[1].x, TEMP[0]., TEMP[1].
  5: SLT TEMP[2].x, CONST[0]., IMM[0].
  6: F2I TEMP[2].x, -TEMP[2]
  7: OR TEMP[1].x, TEMP[1]., TEMP[2].
  8: IF TEMP[1]. :0
  9:   MOV_SAT OUT[0], CONST[0]
 10:   RET
 11: ENDIF
 12: ADD TEMP[0], IMM[0]., -CONST[0]
 13: MOV_SAT OUT[0], TEMP[0]
 14: END

And the IR part like this:
  %53 = fcmp une <8 x float> %52, zeroinitializer
  %54 = sext <8 x i1> %53 to <8 x i32>
// %54 is the result of the if condition of line 8
...
  %70 = call <8 x float> @llvm.x86.avx.max.ps.256(<8 x float> %57, <8 x float>
zeroinitializer)
  %71 = call <8 x float> @llvm.x86.avx.min.ps.256(<8 x float> %70, <8 x float>
<float 1.00e+00, float 1.00e+00, float 1.00e+00, float
1.00e+00, float 1.00e+00, float 1.00e+00, float 1.00e+00, float
1.00e+00>)
// %71 contains mov_sat of line 9
  %72 = bitcast <8 x i32> %54 to <8 x float>
// %72 is still condition of line 8 as float
  %73 = call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float>
zeroinitializer, <8 x float> %71, <8 x float> %72)
// %73 contains the output going to color pack - you can see that if the
condition (%72) wasn't true, it simply selects zero.
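
The effect described above can be mimicked for a single lane in plain C. This is only a model of the quoted IR, under the assumption that the intended result is the saturated value of whichever path executed; the function names and the scalar treatment of one lane are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

static float saturate(float v)
{
    assert(v == v);  /* NaN inputs are outside the scope of this sketch */
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* What the quoted IR computes: blendv(zero, mov_sat(const), cond). */
static float color_as_generated(bool cond, float constant, float imm)
{
    (void)imm;  /* the value from TGSI lines 12-13 never reaches the output */
    return cond ? saturate(constant) : 0.0f;
}

/* What the TGSI asks for: line 9 if the branch is taken, else lines 12-13. */
static float color_as_intended(bool cond, float constant, float imm)
{
    return cond ? saturate(constant) : saturate(imm - constant);
}
```

The two functions agree whenever the condition is true and differ only on the fall-through path, which matches the observation that the early-exit path is fine.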



Re: [Mesa-dev] RFC: C11 threads.h

2013-03-15 Thread Brian Paul

On 03/14/2013 05:10 PM, Jose Fonseca wrote:

The Mesa source tree currently has 4 abstractions of threading primitives (in 
gallium, glapi, mapi, and egl components).

I'd like to unify them all, and since the C11 standard has now introduced a 
threads.h header, I'd like to use that as the model.

So far I've imported a C11 threads.h implementation from 
https://gist.github.com/yohhoy/2223710 , and reimplemented all

   http://cgit.freedesktop.org/~jrfonseca/mesa/log/?h=c11-threads

If there are no objections, the next step would be to eliminate all threading 
abstractions and just use C11 threads.h primitives.

The only snafu is the widespread use of static mutex initializers. They are not 
supported by C11 threads -- I believe one needs to do the initialization via 
the call_once paradigm instead. But for now I just sidestepped the problem by 
defining a non-standard initializer. I'll revisit this later.

A nice side benefit of this is that the pre-Vista Win32 condition variable 
implementation should be much better than what we have now, which should speed up 
multithreaded llvmpipe on Windows substantially.

The same principle could be applied to atomic operations.


This looks OK to me.

Could we eventually go a step further and replace _EGLMutex, 
pipe_mutex, u_mutex, etc. with mtx_t (and similar with thrd_t, cnd_t, 
etc)?
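
For reference, a sketch of how a statically initialized mutex could be replaced using only standard C11 calls: call_once() runs an initialization function exactly once, and mtx_init() creates the mutex there. It assumes a C library that ships <threads.h>; the counter and the function names are invented for the example and do not come from the Mesa tree.

```c
#include <assert.h>
#include <threads.h>

static mtx_t counter_mutex;
static once_flag counter_once = ONCE_FLAG_INIT;
static int counter;

/* Runs exactly once, no matter how many threads reach call_once(). */
static void counter_mutex_init(void)
{
    int ret = mtx_init(&counter_mutex, mtx_plain);
    assert(ret == thrd_success);
    (void)ret;
}

/* Thread-safe increment without relying on a static mutex initializer. */
static int counter_increment(void)
{
    int value;

    call_once(&counter_once, counter_mutex_init);
    mtx_lock(&counter_mutex);
    value = ++counter;
    mtx_unlock(&counter_mutex);
    return value;
}
```

The cost of this pattern is one call_once() check on every entry, which is the trade-off against a non-standard static initializer.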


-Brian


[Mesa-dev] [Bug 8701] Incorrect line clipping with very large coordinates

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8701

--- Comment #3 from chemtech patsev.an...@gmail.com ---
Brian Paul
Do you still experience this issue with newer software?
Please check the status of your issue.



[Mesa-dev] [Bug 9922] Services for Unix compile

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=9922

--- Comment #1 from chemtech patsev.an...@gmail.com ---
f...@lanl.gov
Do you still experience this issue with newer software?
Please check the status of your issue.



Re: [Mesa-dev] RFC: C11 threads.h

2013-03-15 Thread Jose Fonseca


- Original Message -
 On 03/14/2013 05:10 PM, Jose Fonseca wrote:
  The Mesa source tree currently has 4 abstractions of threading primitives (in
  gallium, glapi, mapi, and egl components).
 
  I'd like to unify them all, and since the C11 standard has now introduced a
  threads.h header, I'd like to use that as the model.
 
  So far I've imported a C11 threads.h implementation from
  https://gist.github.com/yohhoy/2223710 , and reimplemented all
 
 http://cgit.freedesktop.org/~jrfonseca/mesa/log/?h=c11-threads
 
  If there are no objections, the next step would be to eliminate all
  threading abstractions and just use C11 threads.h primitives.
 
  The only snafu is the widespread use of static mutex initializers. They are
  not supported by C11 threads -- I believe one needs to do the
  initialization via the call_once paradigm instead. But for now I just
  sidestepped the problem by defining a non-standard initializer. I'll revisit
  this later.
 
  A nice side benefit of this is that the pre-Vista Win32 condition variable
  implementation should be much better than what we have now, which should speed
  up multithreaded llvmpipe on Windows substantially.
 
  The same principle could be applied to atomic operations.
 
 This looks OK to me.

Thanks.
 
 Could we eventually go a step further and replace _EGLMutex,
 pipe_mutex, u_mutex, etc. with mtx_t (and similar with thrd_t, cnd_t,
 etc)?

Definitely. That would be my next step.

Jose


Re: [Mesa-dev] [PATCH] R600: Relax some vector constraints on Dot4.

2013-03-15 Thread Vincent Lejeune
Hi Christian,

LLVM does indeed coalesce registers for R600 targets; I was, however, thinking of 
copies between vectors.
For instance, let's say you have 4 vectors coming from instructions that only 
emit vectors (like TEX_SAMPLE, iirc).

If the shader wants to mix them before doing dp4, you end up with something like:

T0_XYZW = TEX_SAMPLE
T1_XYZW = TEX_SAMPLE
T2_XYZW = TEX_SAMPLE
T3_XYZW = TEX_SAMPLE
T0_W = COPY T4_W
T1_Z =  COPY T3_Z
DOT4 T0_XYZW, T1_XYZW

From the hardware's point of view, the 2 copies are not necessary because the DOT4 
instruction does not require that its operands belong to the same 128-bit 
register.
It's perfectly legal to have a bundle like this one:

Dot4_eg_real T0_X T1_X
Dot4_eg_real T0_Y T1_Y
Dot4_eg_real T0_Z T3_Z
Dot4_eg_real T4_W T1_W

(In fact it is even possible to remove the R600_TReg32_* constraints on the 
inputs, but then you have to ensure the bundle does not read more than
3 GPRs from a channel, which needs much more work.)

The previous case may not seem very frequent, but it still occurs in Lightmark and 
Unigine Heaven.

We represent dot4 inputs as vectors, but using 8 scalar inputs is closer to the 
hardware's capabilities; that's why I wrote this patch. Besides, scalar values usually
have shorter live intervals, which lowers register pressure. Shaders that have dp4 
instructions often end up consuming fewer registers with this patch.
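
The difference between the two operand models can be sketched in C. With eight scalar inputs, each lane of the dot product can take its operands from any register, so nothing has to be copied into a common vector first. The types and function names are illustrative only and do not correspond to LLVM or hardware definitions.

```c
#include <assert.h>

struct vec4 { float x, y, z, w; };

/* Vector form: both operands must already be assembled as whole vectors. */
static float dot4_vec(struct vec4 a, struct vec4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Scalar form: eight independent inputs, one pair per lane. */
static float dot4_scalar(float ax, float bx, float ay, float by,
                         float az, float bz, float aw, float bw)
{
    return ax * bx + ay * by + az * bz + aw * bw;
}

/* Mixing lanes from four registers, as in the bundle quoted above. */
static float dot4_mixed(struct vec4 t0, struct vec4 t1,
                        struct vec4 t3, struct vec4 t4)
{
    /* Copy-based version: patch T0.w and T1.z, then use the vector form. */
    struct vec4 a = t0, b = t1;
    a.w = t4.w;
    b.z = t3.z;
    float with_copies = dot4_vec(a, b);

    /* Copy-free version: feed each lane straight from its source register. */
    float direct = dot4_scalar(t0.x, t1.x, t0.y, t1.y, t0.z, t3.z, t4.w, t1.w);

    assert(with_copies == direct);
    return direct;
}
```

Both versions compute the same value; the scalar form just makes the two patch-up assignments unnecessary.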

Vincent





Re: [Mesa-dev] [PATCH] R600: Relax some vector constraints on Dot4.

2013-03-15 Thread Christian König

Ok that makes more sense, thx for the explanation.

I was just wondering why all this stuff is needed, and as I said, I'm 
not so deeply into the R600 part of the backend.


And I strongly agree that we somehow need to better teach the backend 
about our vector slots and all those limitations (Sometimes I'm really 
happy to work on SI instead).


Christian.

Am 15.03.2013 15:52, schrieb Vincent Lejeune:

Hi Christian,

LLVM does indeed coalesce registers for R600 targets, I was however thinking of 
copies between vectors.
For instance, let's say you have 4 vectors coming from instructions that only 
emit vectors (like TEX_SAMPLE iirc) :

If the shader wants to mix them before doing dp4, you end with something like 
:

T0_XYZW = TEX_SAMPLE
T1_XYZW = TEX_SAMPLE
T2_XYZW = TEX_SAMPLE
T3_XYZW = TEX_SAMPLE
T0_W = COPY T4_W
T1_Z =  COPY T3_Z
DOT4 T0_XYZW, T1_XYZW

 From hw point of view, the 2 copies are not necessary because DOT4 
instructions does not require that its operands belong to the same 128 bits 
register.
It's perfectly legal to have a bundle like this one :

Dot4_eg_real T0_X T1_X
Dot4_eg_real T0_Y T1_Y
Dot4_eg_real T0_Z T3_Z
Dot4_eg_real T4_W T1_W

(In fact it is even possible to remove the R600_TReg32_* constraints on the 
inputs but then you have to ensure the bundle does not read more than
3 gprs from a channel which need much more work)

The previous case may seem not so frequent but it still occurs in Lightmark and 
Unigine Heaven.

We represent dot4 inputs as vectors but using 8 scalar inputs is closer from hw 
capabilities, that's why I wrote this patch. Besides, scalar values usually
have shorter live interval, lowering register pressure. Shaders that have a dp4 
instructions often end up consuming less registers with this patch.

Vincent




- Mail original -

De : Christian König deathsim...@vodafone.de
À : Vincent Lejeune v...@ovi.com
Cc : llvm-comm...@cs.uiuc.edu; mesa-dev@lists.freedesktop.org
Envoyé le : Vendredi 15 mars 2013 11h18
Objet : Re: [Mesa-dev] [PATCH] R600: Relax some vector constraints on Dot4.

Hi Vincent,

while I really appreciate your work, I think your development is going
in the wrong direction here. Those copies you're trying to avoid (not only
with this patch, but also with the previous REG_SEQUENCE patches) shouldn't
happen in the first place. I'm not so deeply into the R600 part of our LLVM
backend that I can say I'm 100% sure, but to me that just looks like
workarounds for an incorrectly defined register space.

Here is a simple example from SI that should show how things are intended to
work. It's a simple 2D texture fetch; the coordinates of this fetch are
usually provided in a two-element vector built of VGPRs (I use a 2D fetch just
for simplicity; a 3D fetch with explicit LOD would work the same way and would
use a four-element vector).

After ISel the assembler code starts with something like this (simplified):
...
%vreg13<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg17<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg22<def> = IMPLICIT_DEF; VReg_64:%vreg22
%vreg21<def,tied1> = INSERT_SUBREG %vreg22<tied0>, %vreg13<kill>, sub0; VReg_64:%vreg21,%vreg22 VReg_32:%vreg13
%vreg23<def,tied1> = INSERT_SUBREG %vreg21<tied0>, %vreg17<kill>, sub1; VReg_64:%vreg23,%vreg21 VReg_32:%vreg17
%vreg24<def> = IMAGE_SAMPLE 15, 0, 0, 0, 0, 0, 0, 0, %vreg23<kill>,


As you can see, the subcomponents of the vectors are inserted/extracted just
like on R600, but the register allocator is capable of handling that
much better than on R600, thus avoiding the (sometimes quite expensive) COPY
operations in the first place. The resulting code looks like this:

...
%vreg23:sub0<def,tied1> = V_INTERP_P2_F32 ...
...
%vreg23:sub1<def,tied1> = V_INTERP_P2_F32 ...

%vreg24<def> = IMAGE_SAMPLE 15, 0, 0, 0, 0, 0, 0, 0, %vreg23, ...

So INSERT_SUBREG isn't replaced with a COPY like on R600, but instead the
V_INTERP_P2_F32 instructions can write directly to the appropriate sub register
component.

I'm not 100% sure why this doesn't work the same way on R600, but I
think it would be a good idea to figure that out.

Cheers,
Christian.

On 14.03.2013 21:51, Vincent Lejeune wrote:

  Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the
  register coalescer to remove some unneeded COPYs.
  This patch also defines some structures/functions that can be used to handle
  all vector instructions (CUBE, Cayman special instructions...) in a similar
  fashion.
  ---
lib/Target/R600/AMDGPUISelLowering.h|  1 +
lib/Target/R600/R600Defines.h   | 74 
lib/Target/R600/R600ExpandSpecialInstrs.cpp | 25 
lib/Target/R600/R600ISelLowering.cpp| 21 +++
lib/Target/R600/R600InstrInfo.cpp   | 88 +
lib/Target/R600/R600InstrInfo.h |  5 ++
lib/Target/R600/R600Instructions.td | 51 -
lib/Target/R600/R600MachineScheduler.cpp|  2 +
 

Re: [Mesa-dev] Google Summer of Code ideas needed

2013-03-15 Thread Vincent Lejeune
Hi,

If LLVM backend development is allowed, maybe a student could work on improving
VLIW5 scheduling for R600 hardware.
So far I have focused on the VLIW4 architecture, but extending the scheduler to
support the Trans ALU wouldn't be too hard.
This would require a way to represent Trans slot compatibility for instructions
in R600Instructions.td, checking for additional constant read/literal
limitations on this slot, and modifying a couple of functions inside
R600MachineScheduler.cpp.
This may look like a short task, but the student would also need some time to
get used to all the tools we use, like piglit, and to understand the LLVM
codebase.



- Original message -
 From: Tom Stellard t...@stellard.net
 To: mesa-dev@lists.freedesktop.org
 Cc: 
 Sent: Wednesday, 13 March 2013, 18:11
 Subject: [Mesa-dev] Google Summer of Code ideas needed
 
 Hi,
 
 It's time again for Google Summer of Code, so we need to start updating
 the X.Org ideas page (http://www.x.org/wiki/SummerOfCodeIdeas) with new
 ideas.  Since there have been a few issues with the wikis lately, if you
 have any ideas please respond to this thread, and I will make sure they
 get onto the official ideas page (but still feel free to update the wiki
 page yourself if you can).  A good project description should contain:
 
 - A brief description of the project
 - A difficulty rating (e.g. easy, medium, hard)
 - The skills / programming languages required
 
 Also, I am going to purge all the old ideas from the ideas page in the
 next week, so if there are any of the old ideas that you think are
 still relevant, let me know and I will keep it.
 
 The ideas page is used as one of the criteria by Google for selecting
 mentoring organizations and part of the reason X.Org was not selected
 last year was that the ideas page was not up to par, so if we want to
 participate in Google Summer of Code this year, it is important we
 have a good ideas page with lots of ideas.
 
 Thanks,
 Tom Stellard


[Mesa-dev] [Bug 8701] Incorrect line clipping with very large coordinates

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8701

--- Comment #4 from Brian Paul bri...@vmware.com ---
Unfortunately, swrast, softpipe and llvmpipe all still suffer from this.



[Mesa-dev] [PATCH] scons: Warn when using MSVS versions prior to 2012.

2013-03-15 Thread jfonseca
From: José Fonseca jfons...@vmware.com

---
 scons/gallium.py |2 ++
 1 file changed, 2 insertions(+)

diff --git a/scons/gallium.py b/scons/gallium.py
index 4d3de82..b28be5d 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -402,6 +402,8 @@ def generate(env):
   '/Oi', # enable intrinsic functions
 ]
 else:
+if distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('11.0'):
+    print 'scons: warning: Visual Studio versions prior to 2012 are known to produce incorrect code when optimizations are enabled ( https://bugs.freedesktop.org/show_bug.cgi?id=58718 )'
 ccflags += [
 '/O2', # optimize for speed
 ]
-- 
1.7.9.5



Re: [Mesa-dev] [PATCH] scons: Warn when using MSVS versions prior to 2012.

2013-03-15 Thread Brian Paul

On 03/15/2013 09:24 AM, jfons...@vmware.com wrote:

From: José Fonseca jfons...@vmware.com

---
  scons/gallium.py |2 ++
  1 file changed, 2 insertions(+)

diff --git a/scons/gallium.py b/scons/gallium.py
index 4d3de82..b28be5d 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -402,6 +402,8 @@ def generate(env):
'/Oi', # enable intrinsic functions
  ]
  else:
+if distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('11.0'):
+    print 'scons: warning: Visual Studio versions prior to 2012 are known to produce incorrect code when optimizations are enabled ( https://bugs.freedesktop.org/show_bug.cgi?id=58718 )'
  ccflags += [
  '/O2', # optimize for speed
  ]


Reviewed-by: Brian Paul bri...@vmware.com


Re: [Mesa-dev] [PATCH 6/6] gallium, st/mesa: don't use blit-based transfers with software rasterizers

2013-03-15 Thread Brian Paul

On 03/14/2013 04:57 PM, Marek Olšák wrote:

On Thu, Mar 14, 2013 at 8:29 PM, Brian Paul bri...@vmware.com wrote:

On 03/14/2013 12:45 PM, Marek Olšák wrote:


The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizers. Now Gallium drivers have the ability to turn
them off.
---



For the rest of the series:

Reviewed-by: Brian Paul bri...@vmware.com


Can you give me a day or two to test your branch with the VMware driver?


Sure. Please let me know when you're done.


OK, it seems alright with the svga driver.  With a debug build, it's 
just slightly slower than before.  I'll try a release build later.


Tested-by: Brian Paul bri...@vmware.com




Re: [Mesa-dev] [PATCH] i965: Don't print a fatal-looking message if intelCreateContext fails.

2013-03-15 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 With the old context creation mechanism, an application asked the GL to
 give it a context.  Failing to produce a context was a fatal error.

 Now, with GLX_ARB_create_context, the application can request a specific
 version.  If it's higher than the maximum version we support, context
 creation will fail.  But this is a normal error that applications
 recover from.

 In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1,
 4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1
 context.  This led to it printing the following message 6 times:
 brwCreateContext: failed to init intel context

 There's no need to alarm users (and developers) with such a message.
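The request-and-fall-back pattern described above can be sketched as a loop over candidate versions. This is an illustration only: `pick_context_version`, `create_fn` and the `supports_up_to` stub are hypothetical names, and a real client would attempt the creation through glXCreateContextAttribsARB rather than a callback.

```c
#include <assert.h>
#include <stddef.h>

struct gl_version { int major, minor; };

/* Hypothetical callback: returns nonzero if a context of this version
 * could be created. */
typedef int (*create_fn)(struct gl_version v, void *user);

/* Try each candidate in order; return the index of the first one that
 * succeeds, or -1 if none does. A failed attempt is an expected outcome
 * here, so nothing is printed for it. */
static int pick_context_version(const struct gl_version *candidates, size_t n,
                                create_fn create, void *user)
{
    assert(create != NULL);
    for (size_t i = 0; i < n; i++) {
        if (create(candidates[i], user))
            return (int)i;
    }
    return -1;
}

/* Stub driver for illustration: accepts every version up to *user. */
static int supports_up_to(struct gl_version v, void *user)
{
    const struct gl_version *max = user;

    return v.major < max->major ||
           (v.major == max->major && v.minor <= max->minor);
}
```

With the candidate list 4.3, 4.2, 4.1, 4.0, 3.3, 3.2, 3.1 and a stub limited to 3.1, the first six attempts fail silently and the seventh succeeds, which matches the six messages mentioned above.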

Reviewed-by: Eric Anholt e...@anholt.net




Re: [Mesa-dev] mesa on solaris

2013-03-15 Thread Kenneth Graunke

On 03/15/2013 09:49 AM, Niveditha Rau wrote:

Hi,

We have been building Mesa 8.0.4 successfully on solaris and I was
trying to update to Mesa 9.0.3, but was running
into build issues with glsl and I ran into the errors with the builtins

./builtins/tools/generate_builtins.py ./builtin_compiler > output


Python shouldn't be too relevant; it's clearly calling builtin_compiler, 
and that's failing.


What messages do you get if you run ./builtin_compiler 
builtins/profiles/110.vert?  Does it even run?


--Ken


Re: [Mesa-dev] [PATCH 0/4] Indirect addressing for radeonsi

2013-03-15 Thread Tom Stellard
On Thu, Mar 14, 2013 at 04:49:35PM +0100, Christian König wrote:
 Hi,
 
 as promised here is the mesa part of my indirect addressing patchset for 
 radeonsi.
 
 Based on the TGSI changes I've send out earlier today.


For the series:
Reviewed-by: Tom Stellard thomas.stell...@amd.com
 
 Please review,
 Christian.
 


Re: [Mesa-dev] mesa on solaris

2013-03-15 Thread Niveditha Rau

On 03/15/13 10:38, Kenneth Graunke wrote:

./builtin_compiler builtins/profiles/110.vert

% ./builtin_compiler builtins/profiles/110.vert
Info log for builtins/profiles/110.vert:
0:1(1): error: syntax error, unexpected $end


Thanks
Niveditha



Re: [Mesa-dev] [PATCH 9/9] tgsi: add ArrayID documentation

2013-03-15 Thread Christoph Bumiller
On 15.03.2013 13:08, Christian König wrote:
 Am 14.03.2013 15:53, schrieb Christoph Bumiller:
 On 14.03.2013 15:20, Christian König wrote:
 From: Christian König christian.koe...@amd.com

 Signed-off-by: Christian König christian.koe...@amd.com
 ---
   src/gallium/docs/source/tgsi.rst |   16 
   1 file changed, 16 insertions(+)

 diff --git a/src/gallium/docs/source/tgsi.rst
 b/src/gallium/docs/source/tgsi.rst
 index d9a7fe9..27fe039 100644
 --- a/src/gallium/docs/source/tgsi.rst
 +++ b/src/gallium/docs/source/tgsi.rst
 @@ -1833,6 +1833,22 @@ If Interpolate flag is set to 1, a
 Declaration Interpolate token follows.
 If file is TGSI_FILE_RESOURCE, a Declaration Resource token
 follows.
   +If Array flag is set to 1, a Declaration Array token follows.
 +
 +Array Declaration
 +
 +
 +Declarations can optional have an ArrayID attribute which can be
 referred by
 +indirect addressing operands. An ArrayID of zero is reserved and
 treaded as
 +if no ArrayID is specified.
 +
 +If an indirect addressing operand refers to an specific declaration
 by using
 s/an/a

 Thx, fixed.


 +an ArrayID only the registers in this declaration are guaranteed to be
 +accessed, accessing any register outside this declaration results
 in undefined
 +behavior.
 + Note that the effective index is zero-based and not relative to the
 specified declaration. XXX: Is it ? Should it be ?

 Yes for compatibility reasons, otherwise we would need to change all
 drivers at once.


 +
 +If no ArrayID is specified with an indirect addressing operand the
 whole
 +register file might be accessed by this operand.
   
 + A practice which is strongly discouraged. Don't do this if you have
 more than 1 declaration for the file in question ! It will prevent
 packing of scalar/vec2 arrays and effective memory alias analysis.

 A bit shortened, but in general added the remark.

 Packing ? Yes !
 We can pack arrays if they're declared as e.g.
 TEMP[0-3].xyzw
 TEMP[4-31].x

 And the caches will be very very thankful that we don't just access
 every 4th element of our 4 times larger than it needs to be buffer !!!
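The saving from packing can be worked out for the two declarations quoted above. The sketch below is only arithmetic under the assumption that a driver knows how many components each declared range uses; `struct decl_range` and both helpers are hypothetical and not part of TGSI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one declaration: an inclusive register range
 * plus the number of components actually used (from a usage mask). */
struct decl_range {
    unsigned first, last;
    unsigned components;   /* 1..4 */
};

/* Scalars needed if every register is stored as a full vec4. */
static size_t unpacked_scalars(const struct decl_range *d, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        assert(d[i].last >= d[i].first);
        total += (size_t)(d[i].last - d[i].first + 1) * 4;
    }
    return total;
}

/* Scalars needed if each range stores only the components it uses. */
static size_t packed_scalars(const struct decl_range *d, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        assert(d[i].last >= d[i].first);
        assert(d[i].components >= 1 && d[i].components <= 4);
        total += (size_t)(d[i].last - d[i].first + 1) * d[i].components;
    }
    return total;
}
```

For TEMP[0-3].xyzw plus TEMP[4-31].x this gives 128 scalars unpacked against 44 packed (16 for the vec4 range, 28 for the scalar range).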

 And if your card can't do that, pleeease be nice and still make it
 possible for other drivers. :o3

 It is probably possible with the new information to do so, but not
 priority for me cause I primary need it for our LLVM backend.

At some point you'll be able to make use of the info in your backend,
too, and then you'll regret having to refamiliarize yourself with this code just
because you didn't add the extra (estimated) 2 lines to set the UsageMask.

Also, NAK from me until array access/declarations for the other files
follow suit.
Sorry for being so ... pesky, but I'd really like this change to be 100%
complete. Come on, doesn't it nag at your conscience if this is left
only a few small steps from perfection?

 Christian.



[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

--- Comment #8 from Roland Scheidegger srol...@vmware.com ---
Looks like we unconditionally return when we see a return not in a function.
Some trivial attempt at fixing that seems to work but I'm not quite sure if
it's right or if we need to do more.

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 0dc26b5..e56bb62 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -348,7 +348,8 @@ static void lp_exec_mask_ret(struct lp_exec_mask *mask, int *pc)
LLVMBuilderRef builder = mask->bld->gallivm->builder;
LLVMValueRef exec_mask;

-   if (mask->call_stack_size == 0) {
+   if (mask->call_stack_size == 0 &&
+   mask->cond_stack_size == 0) {
   /* returning from main() */
   *pc = -1;
   return;

I guess we might also need to check loop_stack_size (we could have a loop which
never executes).
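The condition under discussion can be isolated into a tiny predicate. This is a sketch with a stand-in struct that only carries the three depth counters named in this comment; the real struct lp_exec_mask has more fields, and `ret_ends_program` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the nesting counters mentioned above. */
struct exec_mask_depths {
    int call_stack_size;
    int cond_stack_size;
    int loop_stack_size;
};

/* A RET may end the whole shader only at the top level of main(): not
 * inside a called function, a conditional, or a loop. In every other
 * case the returning lanes have to be masked off instead. */
static bool ret_ends_program(const struct exec_mask_depths *m)
{
    assert(m->call_stack_size >= 0);
    assert(m->cond_stack_size >= 0);
    assert(m->loop_stack_size >= 0);

    return m->call_stack_size == 0 &&
           m->cond_stack_size == 0 &&
           m->loop_stack_size == 0;
}
```

A RET inside an `if` in main() has `cond_stack_size == 1`, so the predicate is false and the code after the conditional still gets emitted.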



[Mesa-dev] [PATCH] gallivm: fix returning unconditionally from main on TGSI_OPCODE_RET

2013-03-15 Thread sroland
From: Roland Scheidegger srol...@vmware.com

If we're in some conditional we must not return, or the code after
the condition is never executed.
(Probably the same for loops.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 0dc26b5..b5f0ace 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -348,7 +348,9 @@ static void lp_exec_mask_ret(struct lp_exec_mask *mask, int *pc)
LLVMBuilderRef builder = mask->bld->gallivm->builder;
LLVMValueRef exec_mask;
 
-   if (mask->call_stack_size == 0) {
+   if (mask->call_stack_size == 0 &&
+   mask->cond_stack_size == 0 &&
+   mask->loop_stack_size == 0) {
   /* returning from main() */
   *pc = -1;
   return;
-- 
1.7.9.5



[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

--- Comment #9 from Brian Paul bri...@vmware.com ---
Nice find, Roland.  Do you want to fix up my test case and put it into piglit?



Re: [Mesa-dev] [PATCH] gallivm: fix returning unconditionally from main on TGSI_OPCODE_RET

2013-03-15 Thread Brian Paul

On 03/15/2013 12:34 PM, srol...@vmware.com wrote:

From: Roland Scheidegger srol...@vmware.com

If we're in some conditional we must not return, or the code after
the condition is never executed.
(Probably the same for loops.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.
---
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 0dc26b5..b5f0ace 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -348,7 +348,9 @@ static void lp_exec_mask_ret(struct lp_exec_mask *mask, int *pc)
 LLVMBuilderRef builder = mask->bld->gallivm->builder;
 LLVMValueRef exec_mask;

-   if (mask->call_stack_size == 0) {
+   if (mask->call_stack_size == 0 &&
+   mask->cond_stack_size == 0 &&
+   mask->loop_stack_size == 0) {
/* returning from main() */
*pc = -1;
return;


Reviewed-by: Brian Paul bri...@vmware.com


Re: [Mesa-dev] [PATCH] libclc: Add max() builtin function

2013-03-15 Thread Tom Stellard
On Thu, Mar 14, 2013 at 09:12:57PM -0500, Aaron Watry wrote:
 Adds this function for both int and floating data types.
 ---
  generic/include/clc/clc.h   |2 ++
  generic/include/clc/integer/max.h   |2 ++
  generic/include/clc/integer/max.inc |1 +
  generic/include/clc/math/max.h  |2 ++
  generic/include/clc/math/max.inc|1 +

I think the math/max builtin should be moved to the common directory
(this doesn't exist, you will need to create it) to match the spec.

-Tom

  generic/lib/SOURCES |2 ++
  generic/lib/integer/max.cl  |4 
  generic/lib/integer/max.inc |3 +++
  generic/lib/math/max.cl |8 
  generic/lib/math/max.inc|3 +++
  10 files changed, 28 insertions(+)
  create mode 100644 generic/include/clc/integer/max.h
  create mode 100644 generic/include/clc/integer/max.inc
  create mode 100644 generic/include/clc/math/max.h
  create mode 100644 generic/include/clc/math/max.inc
  create mode 100644 generic/lib/integer/max.cl
  create mode 100644 generic/lib/integer/max.inc
  create mode 100644 generic/lib/math/max.cl
  create mode 100644 generic/lib/math/max.inc
 
 diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
 index 4394c9e..f6668a3 100644
 --- a/generic/include/clc/clc.h
 +++ b/generic/include/clc/clc.h
 @@ -45,6 +45,7 @@
  #include <clc/math/log.h>
  #include <clc/math/log2.h>
  #include <clc/math/mad.h>
 +#include <clc/math/max.h>
  #include <clc/math/pow.h>
  #include <clc/math/sin.h>
  #include <clc/math/sqrt.h>
 @@ -63,6 +64,7 @@
  #include <clc/integer/abs.h>
  #include <clc/integer/abs_diff.h>
  #include <clc/integer/add_sat.h>
 +#include <clc/integer/max.h>
  #include <clc/integer/sub_sat.h>
  
  /* 6.11.5 Geometric Functions */
 diff --git a/generic/include/clc/integer/max.h 
 b/generic/include/clc/integer/max.h
 new file mode 100644
 index 000..e74a459
 --- /dev/null
 +++ b/generic/include/clc/integer/max.h
 @@ -0,0 +1,2 @@
 +#define BODY <clc/integer/max.inc>
 +#include <clc/integer/gentype.inc>
 diff --git a/generic/include/clc/integer/max.inc 
 b/generic/include/clc/integer/max.inc
 new file mode 100644
 index 000..ce6c6d0
 --- /dev/null
 +++ b/generic/include/clc/integer/max.inc
 @@ -0,0 +1 @@
 +_CLC_OVERLOAD _CLC_DECL GENTYPE max(GENTYPE a, GENTYPE b);
 diff --git a/generic/include/clc/math/max.h b/generic/include/clc/math/max.h
 new file mode 100644
 index 000..3d158f1
 --- /dev/null
 +++ b/generic/include/clc/math/max.h
 @@ -0,0 +1,2 @@
 +#define BODY <clc/math/max.inc>
 +#include <clc/math/gentype.inc>
 diff --git a/generic/include/clc/math/max.inc 
 b/generic/include/clc/math/max.inc
 new file mode 100644
 index 000..ce6c6d0
 --- /dev/null
 +++ b/generic/include/clc/math/max.inc
 @@ -0,0 +1 @@
 +_CLC_OVERLOAD _CLC_DECL GENTYPE max(GENTYPE a, GENTYPE b);
 diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
 index 86c008b..b593941 100644
 --- a/generic/lib/SOURCES
 +++ b/generic/lib/SOURCES
 @@ -7,6 +7,7 @@ integer/abs.cl
  integer/add_sat.cl
  integer/add_sat.ll
  integer/add_sat_impl.ll
 +integer/max.cl
  integer/sub_sat.cl
  integer/sub_sat.ll
  integer/sub_sat_impl.ll
 @@ -14,6 +15,7 @@ math/fmax.cl
  math/fmin.cl
  math/hypot.cl
  math/mad.cl
 +math/max.cl
  relational/any.cl
  workitem/get_global_id.cl
  workitem/get_global_size.cl
 diff --git a/generic/lib/integer/max.cl b/generic/lib/integer/max.cl
 new file mode 100644
 index 000..89fec7c
 --- /dev/null
 +++ b/generic/lib/integer/max.cl
 @@ -0,0 +1,4 @@
 +#include <clc/clc.h>
 +
 +#define BODY <max.inc>
 +#include <clc/integer/gentype.inc>
 diff --git a/generic/lib/integer/max.inc b/generic/lib/integer/max.inc
 new file mode 100644
 index 000..37409fc
 --- /dev/null
 +++ b/generic/lib/integer/max.inc
 @@ -0,0 +1,3 @@
 +_CLC_OVERLOAD _CLC_DEF GENTYPE max(GENTYPE a, GENTYPE b) {
 +  return (a > b ? a : b);
 +}
 diff --git a/generic/lib/math/max.cl b/generic/lib/math/max.cl
 new file mode 100644
 index 000..d1254a7
 --- /dev/null
 +++ b/generic/lib/math/max.cl
 @@ -0,0 +1,8 @@
 +#include <clc/clc.h>
 +
 +#ifdef cl_khr_fp64
 +#pragma OPENCL EXTENSION cl_khr_fp64 : enable
 +#endif
 +
 +#define BODY <max.inc>
 +#include <clc/math/gentype.inc>
 diff --git a/generic/lib/math/max.inc b/generic/lib/math/max.inc
 new file mode 100644
 index 000..37409fc
 --- /dev/null
 +++ b/generic/lib/math/max.inc
 @@ -0,0 +1,3 @@
 +_CLC_OVERLOAD _CLC_DEF GENTYPE max(GENTYPE a, GENTYPE b) {
 +  return (a > b ? a : b);
 +}
 -- 
 1.7.10.4
 


[Mesa-dev] [PATCH 2/2] i965: Don't validate IR trees on non-debug builds.

2013-03-15 Thread Eric Anholt
This was taking 1.3% of CPU on TF2's load time.
---
 src/mesa/drivers/dri/i965/brw_shader.cpp |2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index e4392bd..59e68d8 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -232,7 +232,9 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 }
   }
 
+#ifdef DEBUG
   validate_ir_tree(shader->ir);
+#endif
 
   reparent_ir(shader->ir, shader->ir);
   ralloc_free(mem_ctx);
-- 
1.7.10.4



[Mesa-dev] [PATCH 1/2] mesa: Don't validate IR trees on non-debug builds.

2013-03-15 Thread Eric Anholt
This was taking 5% of CPU on TF2's load time.
---
 src/mesa/program/ir_to_mesa.cpp |4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 2cb5f02..ae9c0cd 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -3114,7 +3114,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader)
   _mesa_ast_to_hir(shader->ir, state);
 
if (!state->error && !shader->ir->is_empty()) {
+#ifdef DEBUG
   validate_ir_tree(shader->ir);
+#endif
 
   /* Do some optimization at compile time to reduce shader IR size
* and reduce later work if the same shader is linked multiple times
@@ -3122,7 +3124,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader)
   while (do_common_optimization(shader->ir, false, false, 32))
 ;
 
+#ifdef DEBUG
   validate_ir_tree(shader->ir);
+#endif
}
 
shader->symbols = state->symbols;
-- 
1.7.10.4



Re: [Mesa-dev] libclc: Improve libclc handling of built-in functions

2013-03-15 Thread Tom Stellard
On Thu, Mar 14, 2013 at 10:01:00PM -0500, Aaron Watry wrote:
 This series depends on the one-off patch I just sent to add max().
 
 1) Fix the broken abs_diff integer built-in.
 2) Add clamp for both integer and floating types in a new shared/ dir in order
to reduce code duplication and improve maintainability.
 3) Move the max() function into the shared/ directory. 

This series looks good and actually negates my comment on the max patch.
I think we should make sure we have piglit tests that cover the NAN
and INF cases, because I'm not sure we handle those correctly in the
backend.
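The NaN concern can be illustrated in plain C, which follows the same IEEE comparison rules: any comparison involving NaN is false, so `a > b ? a : b` returns whichever operand happens to be second. This is a sketch on the host side, not a statement about what the backend emits; `max_ternary` and `max_is_order_sensitive` are hypothetical helper names.

```c
#include <math.h>

/* Same expression as the patch's generic max(): a > b ? a : b. */
static float max_ternary(float a, float b)
{
    return a > b ? a : b;
}

/* Returns nonzero when swapping the operands changes whether the result
 * is NaN, which happens when exactly one operand is NaN. */
static int max_is_order_sensitive(float a, float b)
{
    float ab = max_ternary(a, b);
    float ba = max_ternary(b, a);

    return !isnan(ab) != !isnan(ba);
}
```

`max_ternary(NAN, 1.0f)` yields 1.0f while `max_ternary(1.0f, NAN)` yields NaN; infinities compare normally, so they are not order sensitive. A piglit test for these cases would need to pin down which of the two results is expected.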

For the series:
Reviewed-by: Tom Stellard thomas.stell...@amd.com

 


[Mesa-dev] [PATCH 1/2] ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup.

2013-03-15 Thread Paul Berry
Previously, right after calling _mesa_glsl_link_shader(), the fixed
function fragment shader code made several calls with the ostensible
purpose of setting up uniforms for the fragment shader it just
created.

These calls are unnecessary, since _mesa_glsl_link_shader() calls
driver->LinkShader(), which takes care of calling these functions (or
their equivalent).  Also, they are dangerous to call after
_mesa_glsl_link_shader() has returned, because on back-ends such as
i965 which do precompilation, _mesa_glsl_link_shader() may have
already cached pointers to the existing uniform structures; attempting
to set up the uniforms again invalidates those cached pointers.

It was only by sheer coincidence that this wasn't manifesting itself
as a bug.  It turns out that i965's precompile mechanism was always
setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed
function fragment shaders, but during normal usage this bit usually
gets set to 1.  As a result, the precompiled shader (with its invalid
uniform pointers) was not being used.

I'm about to introduce some changes that cause bit 0 of
proj_attrib_mask to be set consistently between precompilation and
normal usage, so to avoid regressions I need to get rid of the
dangerous duplicate uniform setup code first.
---
 src/mesa/main/ff_fragment_shader.cpp | 16 
 1 file changed, 16 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 186988b..01a4542 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -1350,22 +1350,6 @@ create_new_program(struct gl_context *ctx, struct state_key *key)
 
_mesa_glsl_link_shader(ctx, p.shader_program);
 
-   /* Set the sampler uniforms, and relink to get them into the linked
-* program.
-*/
-   struct gl_shader *const fs =
-  p.shader_program->_LinkedShaders[MESA_SHADER_FRAGMENT];
-   struct gl_program *const fp = fs->Program;
-
-   _mesa_generate_parameters_list_for_uniforms(p.shader_program, fs,
-  fp->Parameters);
-
-   _mesa_associate_uniform_storage(ctx, p.shader_program, fp->Parameters);
-
-   _mesa_update_shader_textures_used(p.shader_program, fp);
-   if (ctx->Driver.SamplerUniformChange)
-  ctx->Driver.SamplerUniformChange(ctx, fp->Target, fp);
-
if (!p.shader_program->LinkStatus)
   _mesa_problem(ctx, "Failed to link fixed function fragment shader: %s\n",
p.shader_program->InfoLog);
-- 
1.8.2



[Mesa-dev] [PATCH 2/2] i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask.

2013-03-15 Thread Paul Berry
Previous to this patch, when using fixed function fragment shading,
bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set
differently during precompiles and normal usage.  During precompiles
it was being set only if the fragment shader reads from window
position (which it never does), so it was always being set to 0.
During normal usage it was being set if the vertex shader writes to
all 4 components of gl_Position (which it usually does), so it was
usually being set to 1.  As a result, we were almost always doing an
extra recompile for the fixed function fragment shader.

The recompile was totally unnecessary, though, because
brw_wm_prog_key::proj_attrib_mask is only consulted for
fs_visitor::emit_general_interpolation(), which isn't used for
VARYING_SLOT_POS.

This patch avoids the unnecessary recompile by always setting bit
VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.
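Why a single differing key bit forces a recompile can be shown with a toy cache that compares keys byte for byte. This is a sketch, not the i965 program cache: `struct prog_key`, `struct prog_cache`, `get_program` and `POS_BIT` are hypothetical, with `POS_BIT` standing in for VARYING_BIT_POS as bit 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POS_BIT ((uint64_t)1 << 0)  /* stand-in for VARYING_BIT_POS */
#define CACHE_SLOTS 8

/* Hypothetical, much-reduced program key. */
struct prog_key {
    uint64_t proj_attrib_mask;
};

struct prog_cache {
    struct prog_key keys[CACHE_SLOTS];
    size_t used;
    unsigned compiles;  /* number of compiles that were needed */
};

/* Look the key up by exact comparison; count a compile on a miss. */
static void get_program(struct prog_cache *c, const struct prog_key *key)
{
    for (size_t i = 0; i < c->used; i++) {
        if (memcmp(&c->keys[i], key, sizeof(*key)) == 0)
            return;  /* hit: the earlier compile is reused */
    }
    assert(c->used < CACHE_SLOTS);
    c->keys[c->used++] = *key;
    c->compiles++;
}
```

If the precompile key has `POS_BIT` clear and the draw-time key has it set, the second lookup misses and `compiles` ends at 2; once both paths set the bit, the draw-time lookup hits and `compiles` stays at 1.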
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 src/mesa/drivers/dri/i965/brw_wm.c   | 8 ++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d0f5fea..5a5bfeb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2987,6 +2987,12 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
 
if (prog->Name != 0)
   key.proj_attrib_mask = ~(GLbitfield64) 0;
+   else {
+  /* Bit VARYING_BIT_POS of key.proj_attrib_mask is never used, so to
+   * avoid unnecessary recompiles, always set it to 1.
+   */
+  key.proj_attrib_mask |= VARYING_BIT_POS;
+   }
 
if (intel->gen < 6)
   key.vp_outputs_written |= BITFIELD64_BIT(VARYING_SLOT_POS);
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 39cbbb7..bec8d85 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -429,8 +429,12 @@ static void brw_wm_populate_key( struct brw_context *brw,
 */
if (ctx->Shader.CurrentFragmentProgram)
   key->proj_attrib_mask = ~(GLbitfield64) 0;
-   else
-  key->proj_attrib_mask = brw->wm.input_size_masks[4-1];
+   else {
+  /* Bit VARYING_BIT_POS of key.proj_attrib_mask is never used, so to
+   * avoid unnecessary recompiles, always set it to 1.
+   */
+  key->proj_attrib_mask = brw->wm.input_size_masks[4-1] | VARYING_BIT_POS;
+   }
 
/* _NEW_LIGHT */
key->flat_shade = (ctx->Light.ShadeModel == GL_FLAT);
-- 
1.8.2



[Mesa-dev] [PATCH 1/2] r600g: fall back to blitter for compressed textures

2013-03-15 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

The hw can only access compressed textures as tiled not
linear so we need to do format tricks to handle them
properly.  The blitter code already handles this so
just fallback to the blitter for compressed textures.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 src/gallium/drivers/r600/evergreen_state.c |9 +
 src/gallium/drivers/r600/r600_state.c  |9 +
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 2bdefb0..4387c86 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3674,6 +3674,15 @@ boolean evergreen_dma_blit(struct pipe_context *ctx,
return FALSE;
}
 
+   /* HW can only handle tiled compressed textures.
+* Need to do format tricks in blitter code to handle them
+* properly so bail here and let the blitter code handle it.
+*/
+   if (src_mode != dst_mode) {
+   if (util_format_is_compressed(src->format))
+   return FALSE;
+   }
+
if (src_mode == dst_mode) {
uint64_t dst_offset, src_offset;
/* simple dma blit would do NOTE code here assume :
diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 846c159..8929d6e 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -3113,6 +3113,15 @@ boolean r600_dma_blit(struct pipe_context *ctx,
return FALSE;
}
 
+   /* HW can only handle tiled compressed textures.
+* Need to do format tricks in blitter code to handle them
+* properly so bail here and let the blitter code handle it.
+*/
+   if (src_mode != dst_mode) {
+   if (util_format_is_compressed(src->format))
+   return FALSE;
+   }
+
if (src_mode == dst_mode) {
uint64_t dst_offset, src_offset, size;
 
-- 
1.7.7.5



[Mesa-dev] [PATCH 2/2] r600g: properly set non_disp tiling mode for DMA

2013-03-15 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

Needs to be set just like other blocks.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 src/gallium/drivers/r600/evergreen_state.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 4387c86..bef3577 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3528,7 +3528,7 @@ static void evergreen_dma_copy_tile(struct r600_context 
*rctx,
struct r600_texture *rdst = (struct r600_texture*)dst;
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
-   unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split;
+   unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split, 
non_disp_tiling = 0;
uint64_t base, addr;
 
/* make sure that the dma ring is only one active */
@@ -3541,6 +3541,13 @@ static void evergreen_dma_copy_tile(struct r600_context 
*rctx,
dst_mode = dst_mode == RADEON_SURF_MODE_LINEAR_ALIGNED ? 
RADEON_SURF_MODE_LINEAR : dst_mode;
assert(dst_mode != src_mode);
 
+   if (util_format_has_depth(util_format_description(src->format)))
+   non_disp_tiling = 1;
+   if (rctx->chip_class == CAYMAN) {
+   if (util_format_get_blocksize(src->format) >= 16)
+   non_disp_tiling = 1;
+   }
+
y = 0;
sub_cmd = 0x8;
lbpp = util_logbase2(bpp);
@@ -3620,7 +3627,7 @@ static void evergreen_dma_copy_tile(struct r600_context 
*rctx,
cs->buf[cs->cdw++] = (pitch_tile_max << 0) | ((height - 1) << 16);
cs->buf[cs->cdw++] = (slice_tile_max << 0);
cs->buf[cs->cdw++] = (x << 0) | (z << 18);
-   cs->buf[cs->cdw++] = (y << 0) | (tile_split << 21) | (nbanks << 25);
+   cs->buf[cs->cdw++] = (y << 0) | (tile_split << 21) | (nbanks << 25) | (non_disp_tiling << 28);
cs->buf[cs->cdw++] = addr & 0xfffc;
cs->buf[cs->cdw++] = (addr >> 32UL) & 0xff;
copy_height -= cheight;
-- 
1.7.7.5
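
A small sketch of the dword packing touched by the last hunk, reading it as a bitfield pack with shifts of 0, 21, 25 and 28. The function name is hypothetical, and the field widths are not stated in the patch, so no masking is attempted.

```c
#include <stdint.h>

/* Pack the tiling dword: y at bit 0, tile_split at bit 21, nbanks at
 * bit 25 and the non-displayable tiling flag at bit 28. Callers are
 * assumed to pass values that fit their fields. */
static uint32_t sketch_pack_tile_dword(uint32_t y, uint32_t tile_split,
                                       uint32_t nbanks,
                                       uint32_t non_disp_tiling)
{
   return (y << 0) | (tile_split << 21) | (nbanks << 25) |
          (non_disp_tiling << 28);
}
```

With non_disp_tiling left at 0 the result is the same as before the patch, so only depth surfaces should see a different command stream.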



Re: [Mesa-dev] [PATCH 1/2] mesa: Don't validate IR trees on non-debug builds.

2013-03-15 Thread Ian Romanick

On 03/15/2013 12:04 PM, Eric Anholt wrote:

This was taking 5% of CPU on TF2's load time.


Crap... I thought we already only did this in debug mode.  The series is

Reviewed-by: Ian Romanick ian.d.roman...@intel.com


---
  src/mesa/program/ir_to_mesa.cpp |4 
  1 file changed, 4 insertions(+)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 2cb5f02..ae9c0cd 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -3114,7 +3114,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader)
_mesa_ast_to_hir(shader->ir, state);

 if (!state->error && !shader->ir->is_empty()) {
+#ifdef DEBUG
validate_ir_tree(shader->ir);
+#endif

/* Do some optimization at compile time to reduce shader IR size
 * and reduce later work if the same shader is linked multiple times
@@ -3122,7 +3124,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader)
while (do_common_optimization(shader->ir, false, false, 32))
 ;

+#ifdef DEBUG
validate_ir_tree(shader->ir);
+#endif
 }

 shader->symbols = state->symbols;





[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

--- Comment #10 from Roland Scheidegger srol...@vmware.com ---
Hmm, I just noticed this doesn't actually work. Instead of not executing the
code after the conditional, the result will now always be as if the condition
passed, so the original testcase now fails. The IR clearly indicates that llvm
dropped all the code for the if condition (the comparison instructions are
still there, but only because their results are stored back to the register
file).
I can't quite see, though, how this could possibly work in a subroutine if it
doesn't work in main. I think there's something wrong with the handling of
condition masks around function calls/rets. I can't quite see yet how this is
all supposed to work together...



Re: [Mesa-dev] [PATCH] gallivm: fix returning unconditionally from main on TGSI_OPCODE_RET

2013-03-15 Thread Roland Scheidegger
Ok, forget about this; it still doesn't work correctly. I need to figure out
how all the mask business fits together.

Roland

On 15.03.2013 19:34, srol...@vmware.com wrote:
 From: Roland Scheidegger srol...@vmware.com
 
 If we're in some conditional we must not return, or the code after
 the condition is never executed.
 (Probably the same for loops.)
 This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.
 
 Note: This is a candidate for the stable branches.
 ---
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)
 
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
 b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
 index 0dc26b5..b5f0ace 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
 @@ -348,7 +348,9 @@ static void lp_exec_mask_ret(struct lp_exec_mask *mask, int *pc)
 LLVMBuilderRef builder = mask->bld->gallivm->builder;
 LLVMValueRef exec_mask;
  
 -   if (mask->call_stack_size == 0) {
 +   if (mask->call_stack_size == 0 &&
 +   mask->cond_stack_size == 0 &&
 +   mask->loop_stack_size == 0) {
/* returning from main() */
*pc = -1;
return;
 


Re: [Mesa-dev] [PATCH V2] mesa: Fix FB blitting in case of zero size src or dst rect

2013-03-15 Thread Ian Romanick

On 03/08/2013 10:14 AM, Anuj Phogat wrote:

Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.

V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.

Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495


Reference the bug below as...



Note: Candidate for all the stable branches.

Signed-off-by: Anuj Phogat anuj.pho...@gmail.com


Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59495
Reviewed-by: Ian Romanick ian.d.roman...@intel.com


---
  src/mesa/main/fbobject.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index d6acc58..0126e29 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -3190,7 +3190,9 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint 
srcX1, GLint srcY1,
}
 }

-   if (!mask) {
+   if (!mask ||
+   (srcX1 - srcX0) == 0 || (srcY1 - srcY0) == 0 ||
+   (dstX1 - dstX0) == 0 || (dstY1 - dstY0) == 0) {
return;
 }
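
The early-out condition from the hunk above can be sketched as a standalone predicate. This is only an illustration; the function name and the plain int/unsigned types are assumptions, not the Mesa prototypes.

```c
#include <stdbool.h>

/* Returns true when a framebuffer blit can be skipped: either no buffer
 * bits are left in mask, or the source or destination rectangle has a
 * zero width or height. Flipped (negative-size) rectangles are not
 * empty and therefore still blit. */
static bool sketch_blit_is_noop(unsigned mask,
                                int srcX0, int srcY0, int srcX1, int srcY1,
                                int dstX0, int dstY0, int dstX1, int dstY1)
{
   return !mask ||
          (srcX1 - srcX0) == 0 || (srcY1 - srcY0) == 0 ||
          (dstX1 - dstX0) == 0 || (dstY1 - dstY0) == 0;
}
```

Note that the check only makes sense after the error checks have run, which is what the V2 note in the commit message is about.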






[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=58718

José Fonseca jfons...@vmware.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |NOTOURBUG

--- Comment #16 from José Fonseca jfons...@vmware.com ---
Given that this works with MSVS 2012, I see no point in trying to work around
it for older versions, given how difficult that is.

I've cross-ported the MSVS 2012 build fixes to the 9.1 stable branch.

So I'm marking this as resolved now.

Thanks.



[Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3

2013-03-15 Thread Christoph Bumiller
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.

The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.

With this patch only nvc0 and nv30 will request that they be used.

v2: introduce a CAP so other drivers don't have to bother with
the new semantic

v3: adapt to the introduction of the gl_varying_slot enum
---
 src/gallium/auxiliary/draw/draw_pipe_wide_point.c |   46 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c|1 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c |4 +-
 src/gallium/docs/source/cso/rasterizer.rst|5 ++
 src/gallium/docs/source/screen.rst|8 
 src/gallium/docs/source/tgsi.rst  |   29 +
 src/gallium/drivers/freedreno/freedreno_screen.c  |2 +
 src/gallium/drivers/i915/i915_screen.c|2 +
 src/gallium/drivers/llvmpipe/lp_screen.c  |1 +
 src/gallium/drivers/nv30/nv30_screen.c|1 +
 src/gallium/drivers/nv30/nvfx_fragprog.c  |   42 ++-
 src/gallium/drivers/nv30/nvfx_vertprog.c  |7 +++-
 src/gallium/drivers/nv50/codegen/nv50_ir_driver.h |2 -
 src/gallium/drivers/nv50/nv50_screen.c|1 +
 src/gallium/drivers/nv50/nv50_surface.c   |5 +-
 src/gallium/drivers/nvc0/nvc0_program.c   |   37 +---
 src/gallium/drivers/nvc0/nvc0_screen.c|1 +
 src/gallium/drivers/r300/r300_screen.c|2 +
 src/gallium/drivers/r600/r600_pipe.c  |2 +
 src/gallium/drivers/radeonsi/radeonsi_pipe.c  |2 +
 src/gallium/drivers/softpipe/sp_screen.c  |2 +
 src/gallium/drivers/svga/svga_screen.c|2 +
 src/gallium/include/pipe/p_defines.h  |3 +-
 src/gallium/include/pipe/p_shader_tokens.h|4 +-
 src/gallium/include/pipe/p_state.h|2 +-
 src/mesa/state_tracker/st_context.c   |3 +
 src/mesa/state_tracker/st_context.h   |2 +
 src/mesa/state_tracker/st_program.c   |   45 +++-
 28 files changed, 171 insertions(+), 92 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c 
b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
index 8e0a117..0d3fee4 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
@@ -52,6 +52,7 @@
  */
 
 
+#include "pipe/p_screen.h"
 #include "pipe/p_context.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
@@ -74,6 +75,9 @@ struct widepoint_stage {
uint num_texcoord_gen;
uint texcoord_gen_slot[PIPE_MAX_SHADER_OUTPUTS];
 
+   /* TGSI_SEMANTIC to which sprite_coord_enable applies */
+   unsigned sprite_coord_semantic;
+
int psize_slot;
 };
 
@@ -233,28 +237,29 @@ widepoint_first_point(struct draw_stage *stage,
 
   wide->num_texcoord_gen = 0;
 
-  /* Loop over fragment shader inputs looking for generic inputs
+  /* Loop over fragment shader inputs looking for the PCOORD input or inputs
* for which bit 'k' in sprite_coord_enable is set.
*/
   for (i = 0; i < fs->info.num_inputs; i++) {
- if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_GENERIC) {
-const int generic_index = fs->info.input_semantic_index[i];
-/* Note that sprite_coord enable is a bitfield of
- * PIPE_MAX_SHADER_OUTPUTS bits.
- */
-if (generic_index < PIPE_MAX_SHADER_OUTPUTS &&
-(rast->sprite_coord_enable & (1 << generic_index))) {
-   /* OK, this generic attribute needs to be replaced with a
-* texcoord (see above).
-*/
-   int slot = draw_alloc_extra_vertex_attrib(draw,
- TGSI_SEMANTIC_GENERIC,
- generic_index);
-
-   /* add this slot to the texcoord-gen list */
-   wide->texcoord_gen_slot[wide->num_texcoord_gen++] = slot;
-}
+ int slot;
+ const unsigned sn = fs->info.input_semantic_name[i];
+ const unsigned si = fs->info.input_semantic_index[i];
+
+ if (sn == wide->sprite_coord_semantic) {
+/* Note that sprite_coord_enable is a bitfield of 32 bits. */
+if (si >= 32 || !(rast->sprite_coord_enable & (1 << si)))
+   continue;
+ } else if (sn != TGSI_SEMANTIC_PCOORD) {
+continue;
  }
+
+ /* OK, this generic attribute needs to be replaced with a
+  * sprite coord (see above).
+  */
+ slot = draw_alloc_extra_vertex_attrib(draw, sn, si);
+
+ /* add this slot to the texcoord-gen list */
+ wide->texcoord_gen_slot[wide->num_texcoord_gen++] = slot;
   }
}
 
@@ -326,6 +331,11 @@ struct 
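
As a reading aid, the selection logic of the new loop body in widepoint_first_point can be sketched as a standalone predicate. The enum values and the function name below are hypothetical placeholders; only the control flow mirrors the hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values; the real TGSI_SEMANTIC_* constants are defined in
 * p_shader_tokens.h and are not reproduced here. */
enum {
   SKETCH_SEMANTIC_GENERIC = 1,
   SKETCH_SEMANTIC_TEXCOORD = 2,
   SKETCH_SEMANTIC_PCOORD = 3
};

/* Decide whether fragment shader input (sn, si) gets a generated sprite
 * coordinate. sprite_coord_semantic is the semantic that the 32-bit
 * sprite_coord_enable mask applies to (GENERIC or TEXCOORD, depending on
 * the driver cap); PCOORD inputs are always replaced. */
static bool sketch_needs_sprite_coord(unsigned sn, unsigned si,
                                      unsigned sprite_coord_semantic,
                                      uint32_t sprite_coord_enable)
{
   if (sn == sprite_coord_semantic)
      return si < 32 && (sprite_coord_enable & (UINT32_C(1) << si)) != 0;
   return sn == SKETCH_SEMANTIC_PCOORD;
}
```

The si < 32 test matters because shifting a 32-bit value by 32 or more is undefined in C, which is presumably why the patch checks the index before testing the bit.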

[Mesa-dev] [PATCH] i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.

2013-03-15 Thread Kenneth Graunke
Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.

This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types.  By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled.  This is a very common
case.

Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.

However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits.  If not, the sampler already returns 1.0
for us without any special swizzling.  XRGB, for example, is a very
common case where this occurs.

This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases.  This at least helps
Warsow.

NOTE: This is a candidate for the 9.1 branch.

Cc: Carl Worth cwo...@cworth.org
Cc: Eric Anholt e...@anholt.net
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 0cb4b2d..771655d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -773,7 +773,8 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
case GL_RED:
case GL_RG:
case GL_RGB:
-  swizzles[3] = SWIZZLE_ONE;
+  if (_mesa_get_format_bits(img->TexFormat, GL_ALPHA_BITS) > 0)
+ swizzles[3] = SWIZZLE_ONE;
   break;
}
 
-- 
1.8.2
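
A minimal sketch of the decision described in the commit message: for a texture whose base format has no alpha (RED, RG, RGB), the alpha swizzle is only overridden when the storage format really carries alpha bits. The enum and function names are hypothetical and the numeric values are placeholders.

```c
/* Placeholder swizzle selectors for the alpha channel. */
enum sketch_swizzle {
   SKETCH_SWIZZLE_W,   /* pass the sampled alpha through (XYZW) */
   SKETCH_SWIZZLE_ONE  /* force alpha to 1.0 (XYZ1) */
};

/* alpha_bits is the alpha width of the texture's memory format.
 * With 0 alpha bits (e.g. XRGB) the sampler is assumed to return 1.0
 * already, so the default swizzle can stay and the precompiled shader
 * still matches. */
static enum sketch_swizzle sketch_alpha_swizzle(int alpha_bits)
{
   return alpha_bits > 0 ? SKETCH_SWIZZLE_ONE : SKETCH_SWIZZLE_W;
}
```

Only the alpha_bits > 0 case still pays for a recompile, which matches the "partially fixes" wording above.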



[Mesa-dev] [PATCH 1/2] r600g: fall back to blitter for compressed textures on cayman (v2)

2013-03-15 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

The DMA block seems to have alignment issues with large
block sizes.  Use the blitter for these surfaces.

v2: cayman/TN only

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 src/gallium/drivers/r600/evergreen_state.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 2bdefb0..b40ed01 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3674,6 +3674,15 @@ boolean evergreen_dma_blit(struct pipe_context *ctx,
return FALSE;
}
 
+   /* The DMA block on cayman seems to have alignment issues
+* with large block sizes.  Needs more investigation.
+*/
+   if ((rctx->chip_class == CAYMAN) &&
+   (src_mode != dst_mode) &&
+   (util_format_get_blocksize(src->format) >= 16)) {
+   return FALSE;
+   }
+
if (src_mode == dst_mode) {
uint64_t dst_offset, src_offset;
/* simple dma blit would do NOTE code here assume :
-- 
1.7.7.5
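
The v2 fallback rule can be summarized as a small predicate, sketched here under the assumption that "large block sizes" means 16 bytes or more, as the threshold in the hunk suggests. The function and parameter names are hypothetical.

```c
#include <stdbool.h>

/* Returns true when the DMA engine may handle the copy, false when the
 * caller should fall back to the blitter. Only tiled<->linear copies
 * (src_mode != dst_mode) on cayman with blocks of 16 bytes or more are
 * rejected. */
static bool sketch_dma_blit_allowed(bool is_cayman, bool same_mode,
                                    unsigned blocksize)
{
   if (is_cayman && !same_mode && blocksize >= 16)
      return false;
   return true;
}
```

Compared with v1, other chips and same-mode copies keep using DMA regardless of block size.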



[Mesa-dev] [PATCH 2/2] r600g: properly set non_disp tiling mode for DMA (v2)

2013-03-15 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

Needs to be set for depth, stencil, and fmask just
like other blocks.

v2: drop additional cayman bits for now

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 src/gallium/drivers/r600/evergreen_state.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index b40ed01..387a0d7 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3528,7 +3528,7 @@ static void evergreen_dma_copy_tile(struct r600_context 
*rctx,
struct r600_texture *rdst = (struct r600_texture*)dst;
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
-   unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split;
+   unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split, 
non_disp_tiling = 0;
uint64_t base, addr;
 
/* make sure that the dma ring is only one active */
@@ -3541,6 +3541,10 @@ static void evergreen_dma_copy_tile(struct r600_context 
*rctx,
dst_mode = dst_mode == RADEON_SURF_MODE_LINEAR_ALIGNED ? 
RADEON_SURF_MODE_LINEAR : dst_mode;
assert(dst_mode != src_mode);
 
+   /* non_disp_tiling bit needs to be set for depth, stencil, and fmask 
surfaces */
+   if (util_format_has_depth(util_format_description(src->format)))
+   non_disp_tiling = 1;
+
y = 0;
sub_cmd = 0x8;
lbpp = util_logbase2(bpp);
@@ -3620,7 +3624,7 @@ static void evergreen_dma_copy_tile(struct r600_context 
*rctx,
cs->buf[cs->cdw++] = (pitch_tile_max << 0) | ((height - 1) << 16);
cs->buf[cs->cdw++] = (slice_tile_max << 0);
cs->buf[cs->cdw++] = (x << 0) | (z << 18);
-   cs->buf[cs->cdw++] = (y << 0) | (tile_split << 21) | (nbanks << 25);
+   cs->buf[cs->cdw++] = (y << 0) | (tile_split << 21) | (nbanks << 25) | (non_disp_tiling << 28);
cs->buf[cs->cdw++] = addr & 0xfffc;
cs->buf[cs->cdw++] = (addr >> 32UL) & 0xff;
copy_height -= cheight;
-- 
1.7.7.5



Re: [Mesa-dev] [PATCH] i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.

2013-03-15 Thread Ian Romanick

On 03/15/2013 03:14 PM, Kenneth Graunke wrote:

Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.

This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types.  By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled.  This is a very common
case.

Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.


I suspect that in a lot of cases the other components of the texture 
aren't used by the shader.  Could we use that information somehow?  We 
ought to be able to track which components of each sampler variable are 
used.



However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits.  If not, the sampler already returns 1.0
for us without any special swizzling.  XRGB, for example, is a very
common case where this occurs.

This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases.  This at least helps
Warsow.

NOTE: This is a candidate for the 9.1 branch.

Cc: Carl Worth cwo...@cworth.org
Cc: Eric Anholt e...@anholt.net
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 0cb4b2d..771655d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -773,7 +773,8 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
 case GL_RED:
 case GL_RG:
 case GL_RGB:
-  swizzles[3] = SWIZZLE_ONE;
+  if (_mesa_get_format_bits(img->TexFormat, GL_ALPHA_BITS) > 0)
+ swizzles[3] = SWIZZLE_ONE;
break;
 }






Re: [Mesa-dev] glxgears is faster but 3D render is so slow

2013-03-15 Thread jupiter
Hi Brian,

On 3/15/13, Brian Paul bri...@vmware.com wrote:
 On 03/15/2013 05:39 AM, jupiter wrote:
 Thanks Brian and Matt.

 On 3/15/13, Matt Turnermatts...@gmail.com  wrote:
 On Thu, Mar 14, 2013 at 6:29 AM, Brian Paulbri...@vmware.com  wrote:
 Hmm, I guess autoconf still has some unneeded dependencies on DRI when
 it's
 not needed.  You might try adding --with-gallium-drivers=swrast so that
 no
 DRI drivers are selected.

 Don't think so. He's just not setting --with-gallium-drivers= so he
 gets a radeon driver by default.

 I did try to setup --with-gallium-drivers=llvm, it was an error, the
 gallium drivers does not have llvm. I now changed to
 --with-gallium-drivers=swrast. Matt is right, I don't need libdrm
 anymore.

 What is the correct llvm for --with-gallium-drivers? I also seen that
 dri-core were also built, that could cause the problems.

 The package built on CentOS 6.2 32-bit machine now included
 lib/gallium, but the libGL.so and libGL.so.1 did not link to
 lib/gallium/libGL.so.1.5.0. After manually linking the lib/libGL.so
 and libGL.so.1 to lib/gallium/libGL.so.1.5.0, although the glxinfo
 OpenGL rendering string is now pointing to the  llvmpipe, but it seems
 broken the xlib driver. It stopped running my 3D application via VNC
 connection. Does the LLVMPIPE use any DRI?

 No.

 Or is it still xlib driver?

 llvmpipe uses Xlib only.


 As the VNC can only use xlib, anything bypass the xlib will break the
 VNC connection.

 Do other OpenGL apps run OK with llvmpipe or is it just Chimera that's
 not working?  How exactly is Chimera failing/broken?

Please see the following benchmarks for both the xlib and llvm drivers:

(1) xlib driver

$ glxinfo | grep "OpenGL renderer string"
OpenGL renderer string: Mesa X11

glxgears 1500 FPS

glxspheres
15 frames/sec - 15 Mpixels/sec

(2) llvm driver

$ glxinfo | grep "OpenGL renderer string"
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.2, 128 bits)

glxgears: 600 FPS

glxspheres: 1 frames/sec - 1 Mpixels/sec

To be fair, when running the llvm driver on my local machine (32-bit
CentOS 6.2, without a VNC connection), it was indeed faster than the
xlib driver.

It seems to me that the llvm driver broke the xlib VNC connection, which
could be because either I haven't configured llvm correctly or the Mesa
llvm build process has bugs.

For your further examination, please see the following configurations
used to build each driver:

(1) Compile xlib driver

${SOURCE}/${CONFIGURE} --prefix=${INSTALL} --enable-xlib-glx
--disable-dri --with-gallium-drivers=swrast


(2) Compile llvm driver

LLVM=/usr/local/libllvm/3.2

${SOURCE}/${CONFIGURE} --prefix=${INSTALL} --enable-xlib-glx
--disable-dri --enable-gallium-llvm --with-gallium-drivers=swrast
--with-llvm-shared-libs=${LLVM}/lib --with-llvm-prefix=${LLVM}

Then manually change libGL.so and libGL.so.1 to link to lib/gallium/libGL.so.1.5.0.

Thank you.

Kind regards,

Jupiter


Re: [Mesa-dev] [PATCH 1/2] mesa: Don't validate IR trees on non-debug builds.

2013-03-15 Thread Eric Anholt
Ian Romanick i...@freedesktop.org writes:

 On 03/15/2013 12:04 PM, Eric Anholt wrote:
 This was taking 5% of CPU on TF2's load time.

 Crap... I thought we already only did this in debug mode.  The series is

 Reviewed-by: Ian Romanick ian.d.roman...@intel.com

Turns out we'd only done it in two out of several places.




Re: [Mesa-dev] [PATCH] i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.

2013-03-15 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
 RED, RG, and RGB textures in order to force alpha to 1.0 in case we
 actually stored the texture as RGBA.

 This had an unforeseen performance implication: the shader precompile
 assumes that the texture swizzle mode will be XYZW for non-shadow
 sampler types.  By setting it to XYZ1, this means every shader used with
 a RED, RG, or RGB texture has to be recompiled.  This is a very common
 case.

 Unfortunately, there's no way to improve the precompile, since RGBA
 textures still need XYZW, and there's no way to know by looking at
 the shader source what texture formats might be used.

 However, we only need to smash alpha to 1.0 if the texture's memory
 format actually has alpha bits.  If not, the sampler already returns 1.0
 for us without any special swizzling.  XRGB, for example, is a very
 common case where this occurs.

 This partially fixes a performance regression since commit 33599433c7.
 More work is required to fully fix it in all cases.  This at least helps
 Warsow.

Now that we have MESA_FORMAT_XBGR16161616_FLOAT and company, we could
potentially make this conditional just die by using those formats.




[Mesa-dev] [PATCH] docs: start release notes file for 9.2

2013-03-15 Thread Andreas Boll
---
 docs/relnotes-9.2.html |   65 
 docs/relnotes.html |1 +
 2 files changed, 66 insertions(+)
 create mode 100644 docs/relnotes-9.2.html

diff --git a/docs/relnotes-9.2.html b/docs/relnotes-9.2.html
new file mode 100644
index 000..2bf9133
--- /dev/null
+++ b/docs/relnotes-9.2.html
@@ -0,0 +1,65 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html lang="en">
+<head>
+  <meta http-equiv="content-type" content="text/html; charset=utf-8">
+  <title>Mesa Release Notes</title>
+  <link rel="stylesheet" type="text/css" href="mesa.css">
+</head>
+<body>
+
+<div class="header">
+  <h1>The Mesa 3D Graphics Library</h1>
+</div>
+
+<iframe src="contents.html"></iframe>
+<div class="content">
+
+<h1>Mesa 9.2 Release Notes / tbd</h1>
+
+<p>
+Mesa 9.2 is a new development release.
+People who are concerned with stability and reliability should stick
+with a previous release or wait for Mesa 9.2.1.
+</p>
+<p>
+Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 3.1.  OpenGL
+3.1 is <strong>only</strong> available if requested at context creation
+because GL_ARB_compatibility is not supported.
+</p>
+
+
+<h2>MD5 checksums</h2>
+<pre>
+tbd
+</pre>
+
+
+<h2>New features</h2>
+
+<p>
+Note: some of the new features are only available with certain drivers.
+</p>
+
+<ul>
+<li>Added new freedreno gallium driver</li>
+<li>Added new OSMesa gallium state tracker</li>
+</ul>
+
+
+<h2>Bug fixes</h2>
+
+<p>TBD -- This list is likely incomplete.</p>
+
+
+<h2>Changes</h2>
+
+<ul>
+<li>Removed d3d1x state tracker (unused, unmaintained and broken)</li>
+</ul>
+
+</div>
+</body>
+</html>
diff --git a/docs/relnotes.html b/docs/relnotes.html
index 2e11bc4..f617efa 100644
--- a/docs/relnotes.html
+++ b/docs/relnotes.html
@@ -21,6 +21,7 @@ The release notes summarize what's new or changed in each Mesa release.
 </p>
 
 <ul>
+<li><a href="relnotes-9.2.html">9.2 release notes</a>
 <li><a href="relnotes-9.1.html">9.1 release notes</a>
 <li><a href="relnotes-9.0.3.html">9.0.3 release notes</a>
 <li><a href="relnotes-9.0.2.html">9.0.2 release notes</a>
-- 
1.7.10.4



Re: [Mesa-dev] [PATCH 1/2] mesa: Don't validate IR trees on non-debug builds.

2013-03-15 Thread Jordan Justen
On Fri, Mar 15, 2013 at 12:04 PM, Eric Anholt e...@anholt.net wrote:
 This was taking 5% of CPU on TF2's load time.

Seems like a candidate for stable.

-Jordan

 ---
  src/mesa/program/ir_to_mesa.cpp |4 
  1 file changed, 4 insertions(+)

 diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
 index 2cb5f02..ae9c0cd 100644
 --- a/src/mesa/program/ir_to_mesa.cpp
 +++ b/src/mesa/program/ir_to_mesa.cpp
 @@ -3114,7 +3114,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, 
 struct gl_shader *shader)
_mesa_ast_to_hir(shader->ir, state);

 if (!state->error && !shader->ir->is_empty()) {
 +#ifdef DEBUG
validate_ir_tree(shader->ir);
 +#endif

/* Do some optimization at compile time to reduce shader IR size
 * and reduce later work if the same shader is linked multiple times
 @@ -3122,7 +3124,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, 
 struct gl_shader *shader)
while (do_common_optimization(shader->ir, false, false, 32))
  ;

 +#ifdef DEBUG
validate_ir_tree(shader->ir);
 +#endif
 }

 shader->symbols = state->symbols;
 --
 1.7.10.4



[Mesa-dev] [Bug 62357] llvmpipe: Fragment Shader with return in main causes back output

2013-03-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=62357

--- Comment #11 from Roland Scheidegger srol...@vmware.com ---
Created attachment 76596
  --> https://bugs.freedesktop.org/attachment.cgi?id=76596&action=edit
another testcase

Here's another test case. With this one you can actually see if both the early
exit and the code after the if are executed properly (as it outputs green on
odd pixels and pink on even ones).
(Requires glsl 1.30 though now.)
Succeeds with softpipe.
With unpatched llvmpipe (that one definitely can't work), it fails on the pink
pixels (as it never executes the code after the ret, hence it outputs
green/black pattern).
With the patch, it fails on the green pixels (it outputs solid pink; why is
less obvious...).



[Mesa-dev] [PATCH] gallivm: fix return opcode handling in main function of a shader

2013-03-15 Thread sroland
From: Roland Scheidegger srol...@vmware.com

If we're in some conditional or loop we must not return, or the code
after the condition is never executed.
(v2): We also can't just continue as if nothing happened, since the
mask update code would later check whether we actually have a mask, so we
need to remember that there was a return in main where we didn't exit.
(To illustrate this: a ret in an if clause would cause a mask update,
which is still OK as we're in a conditional, but after the endif the
mask update code would drop the mask, bringing execution back to
pixels which should have had their execution mask set to zero by the ret.)
Thanks to Christoph Bumiller for figuring this out.
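
The masking problem described above can be sketched with a toy model. This
is only an illustration under stated assumptions, not the gallivm code: every
name is hypothetical (the toy_ prefix marks them), each pixel lane is one bit,
and the track_ret_in_main switch stands in for the difference between the
first version of the patch and v2.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_COND_DEPTH 8

/* Toy model only: one bit per pixel lane. Field names echo the patch,
 * but none of this is the gallivm implementation. */
struct toy_mask {
   uint32_t all_lanes;     /* every lane in the group */
   uint32_t cond_mask;     /* lanes enabled by the enclosing conditionals */
   uint32_t ret_mask;      /* lanes that have not executed a ret yet */
   uint32_t exec_mask;     /* only meaningful while has_mask is true */
   uint32_t cond_stack[TOY_MAX_COND_DEPTH];
   int cond_stack_size;
   bool has_mask;
   bool ret_in_main;
   bool track_ret_in_main; /* hypothetical switch: false models v1 */
};

/* Recompute the execution mask, mirroring the shape of the mask update. */
static void toy_update(struct toy_mask *m)
{
   m->exec_mask = m->cond_mask;
   if (m->ret_in_main)
      m->exec_mask &= m->ret_mask;
   m->has_mask = m->cond_stack_size > 0 || m->ret_in_main;
}

static void toy_init(struct toy_mask *m, uint32_t lanes, bool track)
{
   m->all_lanes = lanes;
   m->cond_mask = lanes;
   m->ret_mask = lanes;
   m->exec_mask = lanes;
   m->cond_stack_size = 0;
   m->has_mask = false;
   m->ret_in_main = false;
   m->track_ret_in_main = track;
}

static void toy_if(struct toy_mask *m, uint32_t cond)
{
   m->cond_stack[m->cond_stack_size++] = m->cond_mask;
   m->cond_mask &= cond;
   toy_update(m);
}

static void toy_endif(struct toy_mask *m)
{
   m->cond_mask = m->cond_stack[--m->cond_stack_size];
   toy_update(m);
}

/* Returns true when the whole program may stop (ret outside any conditional). */
static bool toy_ret(struct toy_mask *m)
{
   if (m->cond_stack_size == 0)
      return true;
   if (m->track_ret_in_main)
      m->ret_in_main = true;       /* remember the ret past the endif */
   m->ret_mask &= ~m->exec_mask;   /* returning lanes leave the ret mask */
   toy_update(m);
   return false;
}

/* Without a mask every lane runs; with one, only the lanes in exec_mask. */
static uint32_t toy_active_lanes(const struct toy_mask *m)
{
   return m->has_mask ? m->exec_mask : m->all_lanes;
}

/* "if (lane is even) ret; endif": which lanes run the code after the endif? */
uint32_t toy_lanes_after_conditional_ret(bool track)
{
   struct toy_mask m;
   toy_init(&m, 0xFu, track);
   toy_if(&m, 0x5u);       /* lanes 0 and 2 take the branch */
   (void)toy_ret(&m);
   toy_endif(&m);
   return toy_active_lanes(&m);
}
```

In this model, toy_lanes_after_conditional_ret(true) leaves only lanes 1 and 3
active after the endif, while toy_lanes_after_conditional_ret(false) drops the
mask and lets all four lanes run again, which matches the solid-colour failure
reported for the first version.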

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |1 +
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   20 +---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index dac97c3..6e65e12 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -243,6 +243,7 @@ struct lp_exec_mask {
    struct lp_build_context *bld;
 
    boolean has_mask;
+   boolean ret_in_main;
 
    LLVMTypeRef int_vec_type;
 
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 0dc26b5..965255a 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -73,6 +73,7 @@ static void lp_exec_mask_init(struct lp_exec_mask *mask, struct lp_build_context
 
    mask->bld = bld;
    mask->has_mask = FALSE;
+   mask->ret_in_main = FALSE;
    mask->cond_stack_size = 0;
    mask->loop_stack_size = 0;
    mask->call_stack_size = 0;
@@ -108,7 +109,7 @@ static void lp_exec_mask_update(struct lp_exec_mask *mask)
    } else
       mask->exec_mask = mask->cond_mask;
 
-   if (mask->call_stack_size) {
+   if (mask->call_stack_size || mask->ret_in_main) {
       mask->exec_mask = LLVMBuildAnd(builder,
                                      mask->exec_mask,
                                      mask->ret_mask,
@@ -117,7 +118,8 @@ static void lp_exec_mask_update(struct lp_exec_mask *mask)
 
    mask->has_mask = (mask->cond_stack_size > 0 ||
                      mask->loop_stack_size > 0 ||
-                     mask->call_stack_size > 0);
+                     mask->call_stack_size > 0 ||
+                     mask->ret_in_main);
 }
 
 static void lp_exec_mask_cond_push(struct lp_exec_mask *mask,
@@ -348,11 +350,23 @@ static void lp_exec_mask_ret(struct lp_exec_mask *mask, int *pc)
    LLVMBuilderRef builder = mask->bld->gallivm->builder;
    LLVMValueRef exec_mask;
 
-   if (mask->call_stack_size == 0) {
+   if (mask->cond_stack_size == 0 &&
+       mask->loop_stack_size == 0 &&
+       mask->call_stack_size == 0) {
       /* returning from main() */
       *pc = -1;
       return;
    }
+
+   if (mask->call_stack_size == 0) {
+      /*
+       * This requires special handling since we need to ensure
+       * we don't drop the mask even if we have no call stack
+       * (e.g. after a ret in a if clause after the endif)
+       */
+      mask->ret_in_main = TRUE;
+   }
+
    exec_mask = LLVMBuildNot(builder,
                             mask->exec_mask,
                             "ret");
-- 
1.7.9.5



[Mesa-dev] [PATCH 0/7] Add support for ARB_texture_storage_multisample

2013-03-15 Thread Chris Forbes
This series adds support for ARB_texture_storage_multisample, which
adds two interesting bits of behavior for multisample textures:

- Immutable-format support, consistent with ARB_texture_storage
- [Get]TexParameter* support

This is admittedly not very useful by itself, but becomes more interesting
when ARB_texture_view is supported.

I have some matching piglits for this which I will send out shortly.

-- Chris



[Mesa-dev] [PATCH 1/7] mesa: add support for immutable textures to teximagemultisample()

2013-03-15 Thread Chris Forbes
The new entrypoints will come later, but this adds the actual logic for
supporting immutable multisample textures:

- The immutability flag is set as desired.
- Attempting to modify an immutable multisample texture produces
  INVALID_OPERATION.

Note: The extension spec does not mention adding this behavior to
TexImage*Multisample, but it seems like the reasonable thing to do.
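
As a rough sketch of the two rules above (not the Mesa code; the toy_ types and
functions are hypothetical stand-ins), a shared helper can refuse any
redefinition once the flag is set and otherwise record the flag its caller
asked for:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GL error codes and texture object state. */
enum toy_error { TOY_NO_ERROR, TOY_INVALID_OPERATION };

struct toy_texture {
   bool immutable;
   int samples, width, height;
};

/* Shared helper in the spirit of teximagemultisample(): both families of
 * entrypoints call it and differ only in the immutable argument. */
static enum toy_error
toy_define_multisample(struct toy_texture *tex, int samples,
                       int width, int height, bool immutable)
{
   if (tex->immutable)
      return TOY_INVALID_OPERATION;   /* storage is already fixed */

   tex->samples = samples;
   tex->width = width;
   tex->height = height;
   tex->immutable = immutable;        /* set the flag as requested */
   return TOY_NO_ERROR;
}

/* Mutable path, analogous to TexImage*Multisample. */
enum toy_error
toy_tex_image(struct toy_texture *tex, int samples, int width, int height)
{
   return toy_define_multisample(tex, samples, width, height, false);
}

/* Immutable path, analogous to the TexStorage*Multisample entrypoints. */
enum toy_error
toy_tex_storage(struct toy_texture *tex, int samples, int width, int height)
{
   return toy_define_multisample(tex, samples, width, height, true);
}
```

With this shape, a texture may be redefined through the mutable path any number
of times, but after one call through the immutable path every later call on
either path reports the invalid-operation error and leaves the texture
unchanged.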

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/main/teximage.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index bc755ae..0cd4beb 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -4194,7 +4194,8 @@ check_multisample_target(GLuint dims, GLenum target)
 static void
 teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
                     GLint internalformat, GLsizei width, GLsizei height,
-                    GLsizei depth, GLboolean fixedsamplelocations)
+                    GLsizei depth, GLboolean fixedsamplelocations,
+                    GLboolean immutable)
 {
    struct gl_texture_object *texObj;
    struct gl_texture_image *texImage;
@@ -4278,6 +4279,13 @@ teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
          return;
       }
 
+      /* Check if texObj->Immutable is set */
+      if (texObj->Immutable) {
+         _mesa_error(ctx, GL_INVALID_OPERATION, "glTexImage%uDMultisample(immutable)",
+                     dims);
+         return;
+      }
+
       ctx->Driver.FreeTextureImageBuffer(ctx, texImage);
 
       _mesa_init_teximage_fields(ctx, texImage,
@@ -4299,6 +4307,7 @@ teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
          }
       }
 
+      texObj->Immutable = immutable;
       _mesa_update_fbo_texture(ctx, texObj, 0, 0);
    }
 }
@@ -4309,7 +4318,7 @@ _mesa_TexImage2DMultisample(GLenum target, GLsizei samples,
                             GLsizei height, GLboolean fixedsamplelocations)
 {
    teximagemultisample(2, target, samples, internalformat,
-         width, height, 1, fixedsamplelocations);
+         width, height, 1, fixedsamplelocations, GL_FALSE);
 }
 
 void GLAPIENTRY
@@ -4319,5 +4328,5 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei samples,
                             GLboolean fixedsamplelocations)
 {
    teximagemultisample(3, target, samples, internalformat,
-         width, height, depth, fixedsamplelocations);
+         width, height, depth, fixedsamplelocations, GL_FALSE);
 }
-- 
1.8.2



[Mesa-dev] [PATCH 2/7] glapi: add definition of ARB_texture_storage_multisample

2013-03-15 Thread Chris Forbes
Adds XML for the extension, dispatch_sanity enabling, and the two new
entrypoints. These are both implemented by calling the shared
teximagemultisample() with immutable=GL_TRUE.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 .../glapi/gen/ARB_texture_storage_multisample.xml  | 31 ++
 src/mapi/glapi/gen/gl_API.xml  |  4 +++
 src/mesa/main/tests/dispatch_sanity.cpp|  4 +--
 src/mesa/main/teximage.c   | 20 ++
 src/mesa/main/teximage.h   | 11 
 5 files changed, 68 insertions(+), 2 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_texture_storage_multisample.xml

diff --git a/src/mapi/glapi/gen/ARB_texture_storage_multisample.xml b/src/mapi/glapi/gen/ARB_texture_storage_multisample.xml
new file mode 100644
index 000..ebd8965
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_texture_storage_multisample.xml
@@ -0,0 +1,31 @@
+<?xml version="1.0"?>
+<!DOCTYPE OpenGLAPI SYSTEM "gl_API.dtd">
+
+<!-- Note: no GLX protocol info yet. -->
+
+<OpenGLAPI>
+
+<category name="GL_ARB_texture_storage_multisample" number="141">
+
+   <function name="TexStorage2DMultisample" offset="assign">
+      <param name="target" type="GLenum"/>
+      <param name="samples" type="GLsizei"/>
+      <param name="internalformat" type="GLint"/>
+      <param name="width" type="GLsizei"/>
+      <param name="height" type="GLsizei"/>
+      <param name="fixedsamplelocations" type="GLboolean"/>
+   </function>
+
+   <function name="TexStorage3DMultisample" offset="assign">
+      <param name="target" type="GLenum"/>
+      <param name="samples" type="GLsizei"/>
+      <param name="internalformat" type="GLint"/>
+      <param name="width" type="GLsizei"/>
+      <param name="height" type="GLsizei"/>
+      <param name="depth" type="GLsizei"/>
+      <param name="fixedsamplelocations" type="GLboolean"/>
+   </function>
+
+</category>
+
+</OpenGLAPI>
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 75957dc..df95924 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8321,6 +8321,10 @@
 
 <xi:include href="ARB_texture_buffer_range.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
 
+<!-- 140. GL_ARB_texture_query_levels -->
+
+<xi:include href="ARB_texture_storage_multisample.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
+
 <!-- Non-ARB extensions sorted by extension number. -->
 
 <category name="GL_EXT_blend_color" number="2">
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp
index 3431ded..ffd83fe 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -895,8 +895,8 @@ const struct function gl_core_functions_possible[] = {
 // { "glShaderStorageBlockBinding", 43, -1 },   // XXX: Add to xml
    { "glTexBufferRange", 43, -1 },
 // { "glTextureBufferRangeEXT", 43, -1 },   // XXX: Add to xml
-// { "glTexStorage2DMultisample", 43, -1 }, // XXX: Add to xml
-// { "glTexStorage3DMultisample", 43, -1 }, // XXX: Add to xml
+   { "glTexStorage2DMultisample", 43, -1 },
+   { "glTexStorage3DMultisample", 43, -1 },
 // { "glTextureStorage2DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 // { "glTextureStorage3DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 0cd4beb..c499b90 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -4330,3 +4330,23 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei samples,
    teximagemultisample(3, target, samples, internalformat,
          width, height, depth, fixedsamplelocations, GL_FALSE);
 }
+
+
+void GLAPIENTRY
+_mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
+  GLint internalformat, GLsizei width,
+  GLsizei height, GLboolean fixedsamplelocations)
+{
+   teximagemultisample(2, target, samples, internalformat,
+ width, height, 1, fixedsamplelocations, GL_TRUE);
+}
+
+void GLAPIENTRY
+_mesa_TexStorage3DMultisample(GLenum target, GLsizei samples,
+  GLint internalformat, GLsizei width,
+  GLsizei height, GLsizei depth,
+  GLboolean fixedsamplelocations)
+{
+   teximagemultisample(3, target, samples, internalformat,
+ width, height, depth, fixedsamplelocations, GL_TRUE);
+}
diff --git a/src/mesa/main/teximage.h b/src/mesa/main/teximage.h
index 744c47a..cedd933 100644
--- a/src/mesa/main/teximage.h
+++ b/src/mesa/main/teximage.h
@@ -305,6 +305,17 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei samples,
 GLsizei height, GLsizei depth,
 GLboolean fixedsamplelocations);
 
+extern void GLAPIENTRY
+_mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
+  GLint internalformat, GLsizei width,
+  GLsizei height, GLboolean fixedsamplelocations);
+
+extern void GLAPIENTRY

[Mesa-dev] [PATCH 3/7] mesa: add enable bit for ARB_texture_storage_multisample

2013-03-15 Thread Chris Forbes
---
 src/mesa/main/extensions.c | 1 +
 src/mesa/main/mtypes.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index e90a296..004fc8e 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -148,6 +148,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_texture_rgb10_a2ui",                  o(ARB_texture_rgb10_a2ui),                  GL,             2009 },
    { "GL_ARB_texture_rg",                          o(ARB_texture_rg),                          GL,             2008 },
    { "GL_ARB_texture_storage",                     o(ARB_texture_storage),                     GL,             2011 },
+   { "GL_ARB_texture_storage_multisample",         o(ARB_texture_storage_multisample),         GL,             2012 },
    { "GL_ARB_texture_swizzle",                     o(EXT_texture_swizzle),                     GL,             2008 },
    { "GL_ARB_timer_query",                         o(ARB_timer_query),                         GL,             2010 },
    { "GL_ARB_transform_feedback2",                 o(ARB_transform_feedback2),                 GL,             2010 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4f09513..c45344c 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3112,6 +3112,7 @@ struct gl_extensions
GLboolean ARB_texture_rg;
GLboolean ARB_texture_rgb10_a2ui;
GLboolean ARB_texture_storage;
+   GLboolean ARB_texture_storage_multisample;
GLboolean ARB_timer_query;
GLboolean ARB_transform_feedback2;
GLboolean ARB_transform_feedback3;
-- 
1.8.2



[Mesa-dev] [PATCH 4/7] mesa: improve reported function name in Tex*Multisample

2013-03-15 Thread Chris Forbes
Now that there are 4 variants, just pass the function name into
teximagemultisample rather than reconstructing it.
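
A minimal sketch of the idea, assuming a hypothetical toy_format_error() helper
rather than Mesa's _mesa_error(): once the caller supplies its own name, the
shared code only needs a "%s(...)" format instead of rebuilding the name from
the dimension count.

```c
#include <stdio.h>

/* Hypothetical helper: writes "<func>(<detail>)" into buf. The entrypoint
 * passes its own name, so the shared code no longer reconstructs it. */
int toy_format_error(char *buf, size_t size,
                     const char *func, const char *detail)
{
   return snprintf(buf, size, "%s(%s)", func, detail);
}
```

Called with "glTexStorage2DMultisample" and "immutable", the buffer receives
"glTexStorage2DMultisample(immutable)"; the older "glTexImage%uDMultisample"
format could never have produced a TexStorage name.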

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/main/teximage.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index c499b90..815547d 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -4195,7 +4195,7 @@ static void
 teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
                     GLint internalformat, GLsizei width, GLsizei height,
                     GLsizei depth, GLboolean fixedsamplelocations,
-                    GLboolean immutable)
+                    GLboolean immutable, const char *func)
 {
    struct gl_texture_object *texObj;
    struct gl_texture_image *texImage;
@@ -4207,12 +4207,12 @@ teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
 
    if (!(ctx->Extensions.ARB_texture_multisample
       && _mesa_is_desktop_gl(ctx))) {
-      _mesa_error(ctx, GL_INVALID_OPERATION, "glTexImage%uDMultisample", dims);
+      _mesa_error(ctx, GL_INVALID_OPERATION, "%s(unsupported)", func);
       return;
    }
 
    if (!check_multisample_target(dims, target)) {
-      _mesa_error(ctx, GL_INVALID_ENUM, "glTexImage%uDMultisample(target)", dims);
+      _mesa_error(ctx, GL_INVALID_ENUM, "%s(target)", func);
       return;
    }
 
@@ -4222,16 +4222,15 @@ teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
 
    if (!is_renderable_texture_format(ctx, internalformat)) {
       _mesa_error(ctx, GL_INVALID_OPERATION,
-            "glTexImage%uDMultisample(internalformat=%s)",
-            dims,
-            _mesa_lookup_enum_by_nr(internalformat));
+            "%s(internalformat=%s)",
+            func, _mesa_lookup_enum_by_nr(internalformat));
       return;
    }
 
    sample_count_error = _mesa_check_sample_count(ctx, target,
          internalformat, samples);
    if (sample_count_error != GL_NO_ERROR) {
-      _mesa_error(ctx, sample_count_error, "glTexImage%uDMultisample(samples)", dims);
+      _mesa_error(ctx, sample_count_error, "%s(samples)", func);
       return;
    }
 
@@ -4239,7 +4238,7 @@ teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
    texImage = _mesa_get_tex_image(ctx, texObj, 0, 0);
 
    if (texImage == NULL) {
-      _mesa_error(ctx, GL_OUT_OF_MEMORY, "glTexImage%uDMultisample()", dims);
+      _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s()", func);
       return;
    }
 
@@ -4269,20 +4268,19 @@ teximagemultisample(GLuint dims, GLenum target, GLsizei samples,
    else {
       if (!dimensionsOK) {
          _mesa_error(ctx, GL_INVALID_VALUE,
-               "glTexImage%uDMultisample(invalid width or height)", dims);
+               "%s(invalid width or height)", func);
          return;
       }
 
       if (!sizeOK) {
          _mesa_error(ctx, GL_OUT_OF_MEMORY,
-               "glTexImage%uDMultisample(texture too large)", dims);
+               "%s(texture too large)", func);
          return;
       }
 
       /* Check if texObj->Immutable is set */
       if (texObj->Immutable) {
-         _mesa_error(ctx, GL_INVALID_OPERATION, "glTexImage%uDMultisample(immutable)",
-                     dims);
+         _mesa_error(ctx, GL_INVALID_OPERATION, "%s(immutable)", func);
          return;
       }
 
@@ -4318,7 +4316,7 @@ _mesa_TexImage2DMultisample(GLenum target, GLsizei samples,
                             GLsizei height, GLboolean fixedsamplelocations)
 {
    teximagemultisample(2, target, samples, internalformat,
-         width, height, 1, fixedsamplelocations, GL_FALSE);
+         width, height, 1, fixedsamplelocations, GL_FALSE, "glTexImage2DMultisample");
 }
 
 void GLAPIENTRY
@@ -4328,7 +4326,7 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei samples,
                             GLboolean fixedsamplelocations)
 {
    teximagemultisample(3, target, samples, internalformat,
-         width, height, depth, fixedsamplelocations, GL_FALSE);
+         width, height, depth, fixedsamplelocations, GL_FALSE, "glTexImage3DMultisample");
 }
 
 
@@ -4338,7 +4336,7 @@ _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
                               GLsizei height, GLboolean fixedsamplelocations)
 {
    teximagemultisample(2, target, samples, internalformat,
-         width, height, 1, fixedsamplelocations, GL_TRUE);
+         width, height, 1, fixedsamplelocations, GL_TRUE, "glTexStorage2DMultisample");
 }
 
 void GLAPIENTRY
@@ -4348,5 +4346,5 @@ _mesa_TexStorage3DMultisample(GLenum target, GLsizei samples,
                               GLboolean fixedsamplelocations)
 {
    teximagemultisample(3, target, samples, internalformat,
-         width, height, depth, fixedsamplelocations, GL_TRUE);
+         width, height, depth, fixedsamplelocations, GL_TRUE, "glTexStorage3DMultisample");
 }
-- 
1.8.2


[Mesa-dev] [PATCH 5/7] mesa: allow multisample texture targets in [Get]TexParameter*

2013-03-15 Thread Chris Forbes
ARB_texture_storage_multisample allows texture parameters to be
queried for TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY
targets.

Some parameters may also be set, with the following exceptions:

- TEXTURE_BASE_LEVEL may not be set to a nonzero value; generates
   INVALID_OPERATION

- any state which appears in the `per-sampler` state table may not
  be set; generates INVALID_OPERATION
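
The two exceptions above can be sketched as a small validation function. This
is an illustration under assumptions, not the texparam.c code: the toy_ enums
are hypothetical stand-ins for the GL targets, parameter names and error codes,
and only a few sampler parameters are listed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GL targets, parameter names and errors. */
enum toy_target {
   TOY_TEXTURE_2D,
   TOY_TEXTURE_2D_MULTISAMPLE,
   TOY_TEXTURE_2D_MULTISAMPLE_ARRAY
};

enum toy_pname {
   TOY_MIN_FILTER,   /* per-sampler state */
   TOY_MAG_FILTER,   /* per-sampler state */
   TOY_WRAP_S,       /* per-sampler state */
   TOY_BASE_LEVEL,
   TOY_SWIZZLE_R
};

enum toy_result { TOY_OK, TOY_ERR_INVALID_OPERATION };

static bool toy_is_multisample(enum toy_target target)
{
   return target == TOY_TEXTURE_2D_MULTISAMPLE ||
          target == TOY_TEXTURE_2D_MULTISAMPLE_ARRAY;
}

static bool toy_is_sampler_state(enum toy_pname pname)
{
   switch (pname) {
   case TOY_MIN_FILTER:
   case TOY_MAG_FILTER:
   case TOY_WRAP_S:
      return true;
   default:
      return false;
   }
}

/* Decide whether setting pname to value is allowed on the given target. */
enum toy_result
toy_check_set_parameter(enum toy_target target, enum toy_pname pname, int value)
{
   if (!toy_is_multisample(target))
      return TOY_OK;                     /* ordinary targets: no extra rule */

   if (toy_is_sampler_state(pname))
      return TOY_ERR_INVALID_OPERATION;  /* sampler state may not be set */

   if (pname == TOY_BASE_LEVEL && value != 0)
      return TOY_ERR_INVALID_OPERATION;  /* only base level 0 is allowed */

   return TOY_OK;
}
```

Checking the target first keeps the existing behaviour for every other texture
target untouched; the patch gets the same effect by adding its checks to the
individual cases of set_tex_parameteri().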

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/main/texparam.c | 92 +++-
 1 file changed, 91 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
index 120845b..07a0eb0 100644
--- a/src/mesa/main/texparam.c
+++ b/src/mesa/main/texparam.c
@@ -175,6 +175,16 @@ get_texobj(struct gl_context *ctx, GLenum target, GLboolean get)
          return texUnit->CurrentTex[TEXTURE_CUBE_ARRAY_INDEX];
       }
       break;
+   case GL_TEXTURE_2D_MULTISAMPLE:
+      if (ctx->Extensions.ARB_texture_storage_multisample) {
+         return texUnit->CurrentTex[TEXTURE_2D_MULTISAMPLE_INDEX];
+      }
+      break;
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+      if (ctx->Extensions.ARB_texture_storage_multisample) {
+         return texUnit->CurrentTex[TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX];
+      }
+      break;
    default:
       ;
    }
@@ -250,6 +260,20 @@ incomplete(struct gl_context *ctx, struct gl_texture_object *texObj)
 }
 
 
+static GLboolean
+target_allows_setting_sampler_parameters(GLenum target)
+{
+   switch (target) {
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+      return GL_FALSE;
+
+   default:
+      return GL_TRUE;
+   }
+}
+
+
 /**
  * Set an integer-valued texture parameter
  * \return GL_TRUE if legal AND the value changed, GL_FALSE otherwise
@@ -261,6 +285,9 @@ set_tex_parameteri(struct gl_context *ctx,
 {
    switch (pname) {
    case GL_TEXTURE_MIN_FILTER:
+      if (!target_allows_setting_sampler_parameters(texObj->Target))
+         goto invalid_operation;
+
       if (texObj->Sampler.MinFilter == params[0])
          return GL_FALSE;
       switch (params[0]) {
@@ -286,6 +313,9 @@ set_tex_parameteri(struct gl_context *ctx,
       return GL_FALSE;
 
    case GL_TEXTURE_MAG_FILTER:
+      if (!target_allows_setting_sampler_parameters(texObj->Target))
+         goto invalid_operation;
+
       if (texObj->Sampler.MagFilter == params[0])
          return GL_FALSE;
       switch (params[0]) {
@@ -300,6 +330,9 @@ set_tex_parameteri(struct gl_context *ctx,
       return GL_FALSE;
 
    case GL_TEXTURE_WRAP_S:
+      if (!target_allows_setting_sampler_parameters(texObj->Target))
+         goto invalid_operation;
+
       if (texObj->Sampler.WrapS == params[0])
          return GL_FALSE;
       if (validate_texture_wrap_mode(ctx, texObj->Target, params[0])) {
@@ -310,6 +343,9 @@ set_tex_parameteri(struct gl_context *ctx,
       return GL_FALSE;
 
    case GL_TEXTURE_WRAP_T:
+      if (!target_allows_setting_sampler_parameters(texObj->Target))
+         goto invalid_operation;
+
       if (texObj->Sampler.WrapT == params[0])
          return GL_FALSE;
       if (validate_texture_wrap_mode(ctx, texObj->Target, params[0])) {
@@ -320,6 +356,9 @@ set_tex_parameteri(struct gl_context *ctx,
       return GL_FALSE;
 
    case GL_TEXTURE_WRAP_R:
+      if (!target_allows_setting_sampler_parameters(texObj->Target))
+         goto invalid_operation;
+
       if (texObj->Sampler.WrapR == params[0])
          return GL_FALSE;
       if (validate_texture_wrap_mode(ctx, texObj->Target, params[0])) {
@@ -335,6 +374,11 @@ set_tex_parameteri(struct gl_context *ctx,
 
       if (texObj->BaseLevel == params[0])
          return GL_FALSE;
+
+      if ((texObj->Target == GL_TEXTURE_2D_MULTISAMPLE ||
+           texObj->Target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) && params[0] != 0)
+         goto invalid_operation;
+
       if (params[0] < 0 ||
           (texObj->Target == GL_TEXTURE_RECTANGLE_ARB && params[0] != 0)) {
          _mesa_error(ctx, GL_INVALID_VALUE,
@@ -348,6 +392,11 @@ set_tex_parameteri(struct gl_context *ctx,
    case GL_TEXTURE_MAX_LEVEL:
       if (texObj->MaxLevel == params[0])
          return GL_FALSE;
+
+      if ((texObj->Target == GL_TEXTURE_2D_MULTISAMPLE ||
+           texObj->Target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) && params[0] != 0)
+         goto invalid_operation;
+
       if (params[0] < 0 || texObj->Target == GL_TEXTURE_RECTANGLE_ARB) {
          _mesa_error(ctx, GL_INVALID_VALUE,
                      "glTexParameter(param=%d)", params[0]);
@@ -373,6 +422,10 @@ set_tex_parameteri(struct gl_context *ctx,
    case GL_TEXTURE_COMPARE_MODE_ARB:
       if ((_mesa_is_desktop_gl(ctx) && ctx->Extensions.ARB_shadow)
           || _mesa_is_gles3(ctx)) {
+
+         if (!target_allows_setting_sampler_parameters(texObj->Target))
+            goto invalid_operation;
+
          if (texObj->Sampler.CompareMode == params[0])
             return GL_FALSE;
          if (params[0] == GL_NONE ||
@@ -388,6 +441,10 @@ 

[Mesa-dev] [PATCH 6/7] i965: enable ARB_texture_storage_multisample on Gen6+

2013-03-15 Thread Chris Forbes
This can be enabled everywhere that ARB_texture_multisample is
supported -- ARB_texture_storage is supported on everything.

Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/intel/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c b/src/mesa/drivers/dri/intel/intel_extensions.c
index 332fdd8..edac4d7 100755
--- a/src/mesa/drivers/dri/intel/intel_extensions.c
+++ b/src/mesa/drivers/dri/intel/intel_extensions.c
@@ -108,6 +108,7 @@ intelInitExtensions(struct gl_context *ctx)
       ctx->Extensions.OES_depth_texture_cube_map = true;
       ctx->Extensions.ARB_shading_language_packing = true;
       ctx->Extensions.ARB_texture_multisample = true;
+      ctx->Extensions.ARB_texture_storage_multisample = true;
    }
 
    if (intel->gen >= 5)
-- 
1.8.2



[Mesa-dev] [PATCH 7/7] docs: mark ARB_texture_storage_multisample done

2013-03-15 Thread Chris Forbes
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index de51693..3c97c8d 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -151,7 +151,7 @@ ARB_shader_storage_buffer_object                     not started
 ARB_stencil_texturing                                not started
 ARB_texture_buffer_range                             DONE (nv50, nvc0)
 ARB_texture_query_levels                             not started
-ARB_texture_storage_multisample                      not started
+ARB_texture_storage_multisample                      DONE (i965)
 ARB_texture_view                                     not started
 ARB_vertex_attrib_binding                            not started
 
-- 
1.8.2
