date:20130313

[Mesa-dev] [Bug 61821] src/mesa/drivers/dri/common/xmlpool.h:96:29: fatal error: xmlpool/options.h

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61821

Vinson Lee v...@freedesktop.org changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #5 from Vinson Lee v...@freedesktop.org ---
mesa: a6bb7a94957468453c436e3860ee2dd47575c461 (master)

This is still failing for me here. The failure case and output are still
identical to comment #0.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radeonsi: Add compute support v2

2013-03-13 Thread Michel Dänzer

On Tue, 2013-03-12 at 17:42 -0400, Alex Deucher wrote:
 On Tue, Mar 12, 2013 at 4:23 PM, Tom Stellard t...@stellard.net wrote:
  From: Tom Stellard thomas.stell...@amd.com
 
  v2:
- Only dump shaders when env variable is set.
 
 A couple of comments below, other than that, looks good.
 
 Reviewed-by: Alex Deucher alexander.deuc...@amd.com

Likewise,

Reviewed-by: Michel Dänzer michel.daen...@amd.com


  @@ -139,6 +140,11 @@ void si_pm4_inval_texture_cache(struct si_pm4_state *state)
      state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
   }
 
  +void si_pm4_inval_texture_l1_cache(struct si_pm4_state *state)
  +{
  +   state->cp_coher_cntl |= S_0085F0_TCL1_ACTION_ENA(1);
  +}
  +
  +
 
 Is there any value in keeping the L1 flush separate?

I don't think so: TC_ACTION_ENA should take care of L1 as well (search
for INVL2 in the register spec).

 Would it make more sense to just add it to si_pm4_inval_texture_cache()?

Yeah, for clarity's sake it might be a good idea to make the above
explicit by adding S_0085F0_TCL1_ACTION_ENA where S_0085F0_TC_ACTION_ENA
is used.
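
A minimal sketch of what making the L1 bit explicit could look like. The struct and the two macros below are stand-ins so the snippet compiles on its own; the real definitions live in the radeonsi driver, and the bit positions used here should be treated as placeholders rather than as the register spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the driver's register macros (bit positions are
 * placeholders for illustration only). */
#define S_0085F0_TCL1_ACTION_ENA(x) (((uint32_t)(x) & 0x1) << 22)
#define S_0085F0_TC_ACTION_ENA(x)   (((uint32_t)(x) & 0x1) << 23)

/* Reduced stand-in for the driver's struct si_pm4_state. */
struct si_pm4_state {
    uint32_t cp_coher_cntl;
};

/* Invalidate the texture cache and name the L1 bit explicitly, instead of
 * keeping a separate si_pm4_inval_texture_l1_cache() helper. */
void si_pm4_inval_texture_cache(struct si_pm4_state *state)
{
    assert(state != NULL);
    state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1) |
                            S_0085F0_TCL1_ACTION_ENA(1);
}
```

With this shape, callers that only wanted the L1 invalidate would simply call si_pm4_inval_texture_cache().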


 On a somewhat related note, I'm also not sure it's worth having a
 separate si_pm4_inval_vertex_cache() since there is no VC anymore and
 the function is identical to si_pm4_inval_texture_cache().

Right, I think the main reason I kept it was in case it might help share
more code with r600g again.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

[Mesa-dev] [Bug 61821] src/mesa/drivers/dri/common/xmlpool.h:96:29: fatal error: xmlpool/options.h

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61821

--- Comment #6 from Marc marvi...@gmx.de ---
works fine here.

git clean -f -d -x -e b.sh
./autogen.sh\
--prefix=/usr   \
--libdir=/usr/lib64 \
--disable-debug \
--enable-texture-float  \
--enable-gles1  \
--enable-gles2  \
--enable-openvg \
--enable-xorg   \
--enable-xvmc   \
--enable-vdpau  \
--enable-shared-glapi   \
--enable-glx-tls\
--enable-gallium-llvm   \
--enable-gallium-egl\
--enable-gbm\
--enable-gallium-gbm\
--with-gallium-drivers=swrast,r600 \
--with-llvm-prefix=/source/dri-project/llvm/Release \
--with-llvm-shared-libs \
--enable-r600-llvm-compiler \
--with-dri-drivers= \
--with-dri-driverdir=/usr/lib64/dri
make

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[Mesa-dev] OSMesa VTK

2013-03-13 Thread Kevin H. Hobbs

My tests of nightly VTK built against nightly OSMesa failed last
night.

The onscreen builds are fine.

The OSMesa build tests now fail with all black output but I
haven't found any error message to inform me of what's going on.

I know osmesa just switched to Gallium: that is reflected in
VTK's LoadOpenGLExtension test output :

GL_VERSION: 2.1 Mesa 9.2-devel (git-6173cc1)
GL_RENDERER: Mesa OffScreen

became

GL_VERSION: 2.1 Mesa 9.2.0 (git-f7ef83c)
GL_RENDERER: Gallium 0.4 on llvmpipe (LLVM 3.0, 128 bits)

Mesa was built :

./autogen.sh \
  --prefix=/home/kevin/mesa_nightly \
  --enable-glx \
  --enable-dri \
  --enable-shared-glapi \
  --enable-gallium-llvm \
  --with-gallium-drivers=nouveau,swrast \
  --enable-osmesa





Re: [Mesa-dev] glxgears is faster but 3D render is so slow

2013-03-13 Thread jupiter

Hi Brian,

Sorry for not being clear, let me clarify it.

On 3/13/13, Brian Paul bri...@vmware.com wrote:
 Well, the Xlib/swrast driver does everything in software, unlike a DRI
 driver which does most things with the GPU.  Xlib will always be slower.

The graphics card in my local test machine does not have hardware
acceleration: it does not support OpenGL and it is not an NVIDIA card. I
guess the DRI driver may still implement GL in software even though it
might access some basic functions of the low-budget graphics card, but
correct me if I am using the wrong terminology.


 The gallium llvmpipe driver should be quite a bit faster.

 Just install LLVM first, then reconfigure/rebuild Mesa, set your
 LD_LIBRARY_PATH to the lib/gallium/ directory and you should get llvmpipe.

Yes, I have already built the Mesalib with LLVM, I posted the
configuration in my last email, let me post it again.

${SOURCE}/${CONFIGURE} --prefix=${INSTALL} --enable-xlib-glx
--disable-dri --enable-gallium-llvm --with-llvm-shared-libs

It did not produce a faster result: there was virtually no difference
when running Chimera compared to using --with-gallium-drivers=swrast,
unless my configuration is wrong. Please let me know the correct
configuration for llvmpipe. The libdrm version is 2.4.42. The libllvm
version is 3.2.

I can also use a test program to measure Mesa with different drivers, if
you could let me know which test program can be used for benchmarking
Mesa with the swrast, llvm, or DRI drivers. Where can I download the
test program's source code?

Please also see attached glxinfo for Mesa llvm.

Thank you.

Kind regards,

Jupiter

 -Brian


 On 03/12/2013 07:37 AM, jupiter wrote:
 Hi Brian,

 You are right, setting MESA_GLX_DEPTH_BITS to 24 bit does not change
 anything. So why Xlib has such poor performance to run following
 application?

 http://www.cgl.ucsf.edu/chimera/download.html

 Are there any other things I can try to make Xlib driver performance
 equals to DRI?

 Thank you.

 Kind regards,

 Jupiter




 On 3/12/13, Brian Paulbri...@vmware.com  wrote:
 I don't think you have to worry about the difference in buffer depths.

 If you really want a 24-bit depth buffer you can do 'export
 MESA_GLX_DEPTH_BITS=24'

 -Brian

 On 03/09/2013 12:48 AM, jupiter wrote:
 Hi Brian,

 Please see attached config.log. Let me make a correction: I mean 32
 buffer bits and 24 depth bits in DRI, and 24 buffer bits and 16 depth
 bits in the xlib driver. Will it make a difference to set 32 buffer bits
 and 24 depth bits for xlib? If so, how do I do it?

 Thank you.

 Kind regards.

 Jupiter



 On 3/8/13, jupiterjupiter@gmail.com   wrote:
 Hi Brian,

 I finally built Mesa with configuration --enable-xlib-glx
 --disable-dri --enable-gallium-llvm --with-llvm-shared-libs, with
 dependencies of llvm and drm. It does not work either, please see
 following glxinfo. Please let me know if my configuration is not
 correct, or if there are any other ways I can try to make it work.

 $ glxinfo
 name of display: :0.0
 display: :0  screen: 0
 direct rendering: Yes
 server glx vendor string: Brian Paul
 server glx version string: 1.4 Mesa 9.1-devel
 server glx extensions:
   GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap,
   GLX_MESA_release_buffers, GLX_ARB_get_proc_address,
   GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info,
 GLX_EXT_visual_rating,
   GLX_SGIX_fbconfig, GLX_SGIX_pbuffer
 client glx vendor string: Brian Paul
 client glx version string: 1.4 Mesa 9.1-devel
 client glx extensions:
   GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap,
   GLX_MESA_release_buffers, GLX_ARB_get_proc_address,
   GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info,
 GLX_EXT_visual_rating,
   GLX_SGIX_fbconfig, GLX_SGIX_pbuffer
 GLX version: 1.4
 GLX extensions:
   GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap,
   GLX_MESA_release_buffers, GLX_ARB_get_proc_address,
   GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info,
 GLX_EXT_visual_rating,
   GLX_SGIX_fbconfig, GLX_SGIX_pbuffer
 OpenGL vendor string: Brian Paul
 OpenGL renderer string: Mesa X11
 OpenGL version string: 2.1 Mesa 9.1-devel
 OpenGL shading language version string: 1.20
 OpenGL extensions:
   GL_ARB_multisample, GL_EXT_abgr, GL_EXT_bgra,
 GL_EXT_blend_color,
   GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_copy_texture,
   GL_EXT_polygon_offset, GL_EXT_subtexture, GL_EXT_texture_object,
   GL_EXT_vertex_array, GL_EXT_compiled_vertex_array,
 GL_EXT_texture,
   GL_EXT_texture3D, GL_IBM_rasterpos_clip,
 GL_ARB_point_parameters,
   GL_EXT_draw_range_elements, GL_EXT_packed_pixels,
 GL_EXT_point_parameters,
   GL_EXT_rescale_normal, GL_EXT_separate_specular_color,
   GL_EXT_texture_edge_clamp, GL_SGIS_generate_mipmap,
   GL_SGIS_texture_border_clamp, GL_SGIS_texture_edge_clamp,
   GL_SGIS_texture_lod, GL_ARB_multitexture,
 GL_IBM_multimode_draw_arrays,

Re: [Mesa-dev] [RFC] GLX_MESA_query_renderer

2013-03-13 Thread Henri Verbeet

On 12 March 2013 17:46, Ian Romanick i...@freedesktop.org wrote:
 Right... the extension also adds an attribute that can only be used with
 glXCreateContextAttribsARB.

Yeah, all I was saying is that it probably wouldn't be too hard to
word things along the lines of "If glXCreateContextAttribsARB() isn't
available, GLX_RENDERER_ID_MESA goes away, and only one renderer is
available / visible." Perhaps it's not worth it though.

 My thinking was that it will be very rare for multiple renderers to support
 the same GL versions and different extension strings... at least in a way
 that would cause apps to make different context creation decisions.

I guess that makes sense in the very coarse "I need at least GL3" way.

 Part of the thinking is that it would force regularity in how the version is
 advertised.  Otherwise everyone will have a different kind of string, and
 the currently annoying situation of parsing implementation dependent strings
 continues.

 Maybe GLX_RENDERER_VERSION_MESA should also be allowed with
 glXQueryRendererStringMESA?

Yeah, I think that makes sense.

 I also based this on ISV feedback.  Some just wanted to know what the
 hardware was, and others wanted to know that and who made the driver.  I was
 really trying to get away from "just parse this random string" for as much
 of the API as possible.  It seems like this should only make things easier
 for apps... should.

In theory you could add a GL vendor ID similar to the PCI vendor ID,
but then you'd have to allocate those globally, which would probably
be annoying. So, yeah.
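
To illustrate the string parsing that the extension is trying to get apps away from, here is a rough sketch of what an application typically does today with a GL_VERSION string. The helper name parse_gl_version is made up for this example; it only pulls out the leading "major.minor" and knows nothing about the vendor-specific remainder of the string.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: extract the leading "major.minor" from a GL_VERSION
 * string such as "2.1 Mesa 9.2.0 (git-f7ef83c)".
 * Returns 1 on success, 0 if the string does not start with two integers. */
int parse_gl_version(const char *version, int *major, int *minor)
{
    assert(major != NULL && minor != NULL);
    if (version == NULL)
        return 0;
    /* Everything after "major.minor" is implementation dependent, which is
     * exactly the part applications cannot parse reliably. */
    return sscanf(version, "%d.%d", major, minor) == 2;
}
```

Anything beyond the two leading integers (driver name, driver version, git revision) differs per implementation, which is the irregularity discussed above.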
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965: Apply depthstencil alignment workaround when doing fast clears.

2013-03-13 Thread Paul Berry

On 12 March 2013 12:53, Paul Berry stereotype...@gmail.com wrote:

 On 12 March 2013 12:28, Eric Anholt e...@anholt.net wrote:

 Paul Berry stereotype...@gmail.com writes:

  Fast depth clears have the same depth/stencil alignment requirements
  as other drawing operations.  Therefore, we need to call
  brw_workaround_depthstencil_alignment() from both the clear and
  drawing paths.
 
  Without this fix, we get image corruption if the following conditions
  hold: (a) the first ever drawing operation to a depth miplevel (or the
  first drawing operation after having used the texture for sampling) is
  a clear, (b) the depth miplevel has a size that is eligible for fast
  depth clears, and (c) the depth miplevel has an offset within the
  miptree that isn't 8x8 aligned.
 
  Fixes piglit depthstencil-render-miplevels tests with size 273.
 
  NOTE: This is a candidate for stable branches
  ---
   src/mesa/drivers/dri/i965/brw_clear.c | 6 +-
   1 file changed, 5 insertions(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
  index 53d8e54..cde1a06 100644
  --- a/src/mesa/drivers/dri/i965/brw_clear.c
  +++ b/src/mesa/drivers/dri/i965/brw_clear.c
  @@ -40,6 +40,8 @@
   #include "intel_mipmap_tree.h"
   #include "intel_regions.h"
 
  +#include "brw_context.h"
  +
   #define FILE_DEBUG_FLAG DEBUG_BLIT
 
   static const char *buffer_names[] = {
  @@ -219,7 +221,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
   static void
   brw_clear(struct gl_context *ctx, GLbitfield mask)
   {
  -   struct intel_context *intel = intel_context(ctx);
  +   struct brw_context *brw = brw_context(ctx);
  +   struct intel_context *intel = &brw->intel;
 
      if (!_mesa_check_conditional_render(ctx))
         return;
  @@ -229,6 +232,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask)
      }
 
      intel_prepare_render(intel);
  +   brw_workaround_depthstencil_alignment(brw);

 It seems like this should be happening in brw_fast_clear(), either
 before calling blorp or inside of it, instead of in the potential
 caller of brw_fast_clear().  Makes sense, though.


 Chad made the same comment to me in person yesterday.  The reason I put it
 here is to accommodate patch 2/2 (which allows
 brw_workaround_depthstencil_alignment to avoid an unnecessary copy when
 clearing the whole miplevel).  If I move the call to
 brw_workaround_depthstencil_alignment into brw_fast_clear_depth(), then the
 unnecessary copy will only be avoided when doing depth clears.  If I leave
 it here, the unnecessary copy will be avoided for all clears.


Correction: when I wrote this I momentarily forgot that the workaround is
only needed for depth and stencil buffers.  So leaving the call to
brw_workaround_depthstencil_alignment here allows us to avoid the
unnecessary copy for both depth and stencil clears, not just depth clears.
 I still think it's worth it, but it's a far less convincing case.
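
For readers following condition (c) in the commit message, here is a small sketch of the alignment test itself. The helper name and the idea of passing the offsets as two plain integers are assumptions made for the example; the driver's actual check lives inside brw_workaround_depthstencil_alignment() and works on miptree state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment, in pixels, that the commit message refers to as "8x8". */
#define DEPTHSTENCIL_ALIGN 8u

/* Hypothetical helper: true when a miplevel's x/y offset within the
 * miptree is not 8x8 aligned, i.e. condition (c) in the commit message. */
bool miplevel_offset_misaligned(uint32_t x_offset, uint32_t y_offset)
{
    const uint32_t mask = DEPTHSTENCIL_ALIGN - 1;

    /* Masking the low bits only works for power-of-two alignments. */
    assert((DEPTHSTENCIL_ALIGN & mask) == 0);

    /* Either coordinate having low bits set means the offset is unaligned. */
    return ((x_offset | y_offset) & mask) != 0;
}
```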

Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.

2013-03-13 Thread Adrian M Negreanu

Hi,

I have tested your changes but it looks like they fail to compile on Android.


==
Tested the patch(es) on top of the following commits:
f7ef83c scons: Define PACKAGE_xxx
6f86b93 docs: rewrite the OSMesa info / instructions
79eac7d configure: wire-up new OSMesa gallium state tracker and target
be51f12 target/osmesa: add new Makefile.am
94263da targets/osmesa: new OSMesa gallium target
7114b6a st/osmesa: add new Makefile.am
73436a9 st/osmesa: new OSMesa gallium state tracker

===
Failed to build for android

f7ef83c scons: Define PACKAGE_xxx
6f86b93 docs: rewrite the OSMesa info / instructions
79eac7d configure: wire-up new OSMesa gallium state tracker and target
be51f12 target/osmesa: add new Makefile.am
94263da targets/osmesa: new OSMesa gallium target
7114b6a st/osmesa: add new Makefile.am
73436a9 st/osmesa: new OSMesa gallium state tracker
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_depth_stencil_state':
src/mesa/drivers/dri/i965/brw_state_dump.c:373:71: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_cc_state_gen6':
src/mesa/drivers/dri/i965/brw_state_dump.c:407:68: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_scissor':
src/mesa/drivers/dri/i965/brw_state_dump.c:436:65: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_vs_constants':
src/mesa/drivers/dri/i965/brw_state_dump.c:449:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c:450:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_wm_constants':
src/mesa/drivers/dri/i965/brw_state_dump.c:466:49: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c:467:47: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_binding_table':
src/mesa/drivers/dri/i965/brw_state_dump.c:483:50: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_prog_cache':
src/mesa/drivers/dri/i965/brw_state_dump.c:511:33: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function 'brw_upload_vs_pull_constants':
src/mesa/drivers/dri/i965/brw_vs_surface_state.c:77:40: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/main/errors.c: In function '_mesa_problem':
src/mesa/main/errors.c:851:15: error: 'PACKAGE_VERSION' undeclared (first use in this function)
src/mesa/main/errors.c:851:15: note: each undeclared identifier is reported only once for each function it appears in
src/mesa/main/errors.c:852:43: error: expected ')' before 'PACKAGE_BUGREPORT'
make: *** [out/target/product/samsungxe700t/obj/STATIC_LIBRARIES/libmesa_dricore_intermediates/main/errors.o] Error 1
FAILURE



--
Regards, Aiaiai

Re: [Mesa-dev] [PATCH] DRI2: don't advertise GLX_INTEL_swap_event if it can't

2013-03-13 Thread Rob Clark

(switching over mesa-dev.. sent to the wrong list initially)

On Wed, Mar 13, 2013 at 8:25 AM, Paul Menzel
paulepan...@users.sourceforge.net wrote:
 Dear Rob,


 On Tuesday, 12.03.2013, at 19:44 -0400, Rob Clark wrote:

 »it« sounds  strange in commit summary.

 If ddx does not support swap, don't advertise it.

Hmm, yeah.. I somehow was having trouble coming up with something short enough


 So how is `dri2BindExtensions` changed. Some things passed beforehand
 are already available in `struct dri2_screen *psc`?

yeah, I suppose I didn't have to remove the extensions arg, but the
code seemed a bit cleaner this way and was trying to avoid
dri2BindExtensions() growing to a huge # of args

 Are bugs fixed by this or did you find this reading through the code?

yes, with "DRI2: Don't disable GLX_INTEL_swap_event unconditionally"
and without this patch, gnome-shell (and probably I guess anything
built on clutter) will be broken for ddx drivers which don't support
swap.  I noticed this when rebasing freedreno to latest mesa (since
currently I have no good kernel interface for page flipping, so I only
advertise DRI2 1.1 (DRI2InfoRec version==3)).

 We might also be able to get rid of the vmwgfx check (I'm not quite
 sure the purpose of that check vs. just checking dri2Minor.

 Missing »)«.

oh, whoops.. well that is easy enough to fix at least

BR,
-R


 Signed-off-by: Rob Clark robdcl...@gmail.com
 ---
  src/glx/dri2_glx.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)

 diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
 index c4f6996..b2d712c 100644
 --- a/src/glx/dri2_glx.c
 +++ b/src/glx/dri2_glx.c
 @@ -1051,11 +1051,16 @@ static const struct glx_context_vtable dri2_context_vtable = {
  };

  static void
 -dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
 +dri2BindExtensions(struct dri2_screen *psc, struct glx_display * priv,

 No space after the * in `* priv`?

                     const char *driverName)
  {
 +   const struct dri2_display *const pdp = (struct dri2_display *)
 +      priv->dri2Display;
 +   const __DRIextension **extensions;
     int i;

 +   extensions = psc->core->getExtensions(psc->driScreen);
 +
     __glXEnableDirectExtension(&psc->base, "GLX_SGI_video_sync");
     __glXEnableDirectExtension(&psc->base, "GLX_SGI_swap_control");
     __glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control");
 @@ -1069,7 +1074,7 @@ dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
      * of disabling it uncondtionally, just disable it for drivers
      * which are known to not support it.
      */
 -   if (strcmp(driverName, "vmwgfx") != 0) {
 +   if (pdp->swapAvailable && strcmp(driverName, "vmwgfx") != 0) {
        __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
     }

 @@ -1212,8 +1217,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
        goto handle_error;
     }

 -   extensions = psc->core->getExtensions(psc->driScreen);
 -   dri2BindExtensions(psc, extensions, driverName);
 +   dri2BindExtensions(psc, priv, driverName);

     configs = driConvertConfigs(psc->core, psc->base.configs, driver_configs);
     visuals = driConvertConfigs(psc->core, psc->base.visuals, driver_configs);


 Thanks,

 Paul
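
The decision the patch makes can be sketched in isolation as follows. The function name should_advertise_swap_event and its two plain parameters are made up for illustration; in the patch itself the same test is an inline condition on pdp->swapAvailable and driverName inside dri2BindExtensions().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone version of the condition in the patch:
 * advertise GLX_INTEL_swap_event only if the ddx supports swap and the
 * driver is not one that is known not to support the event. */
bool should_advertise_swap_event(bool swap_available, const char *driver_name)
{
    assert(driver_name != NULL);
    return swap_available && strcmp(driver_name, "vmwgfx") != 0;
}
```

A ddx that only advertises DRI2 1.1 would end up with swap_available false, so the extension stays hidden regardless of the driver name.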

Re: [Mesa-dev] glxgears is faster but 3D render is so slow

2013-03-13 Thread Brian Paul


On 03/13/2013 04:11 AM, jupiter wrote:

 Hi Brian,

 Sorry for not being clear, let me clarify it.

 On 3/13/13, Brian Paul bri...@vmware.com wrote:
  Well, the Xlib/swrast driver does everything in software, unlike a DRI
  driver which does most things with the GPU.  Xlib will always be slower.

 The graphics card in my local test machine does not have hardware
 acceleration: it does not support OpenGL and it is not an NVIDIA card. I
 guess the DRI driver may still implement GL in software even though it
 might access some basic functions of the low-budget graphics card, but
 correct me if I am using the wrong terminology.

Yes, swrast may also be used via DRI, when there's no DRI driver for
the hardware GPU or when the hardware driver needs a software fallback.

  The gallium llvmpipe driver should be quite a bit faster.

  Just install LLVM first, then reconfigure/rebuild Mesa, set your
  LD_LIBRARY_PATH to the lib/gallium/ directory and you should get llvmpipe.

 Yes, I have already built the Mesalib with LLVM, I posted the
 configuration in my last email, let me post it again.

 ${SOURCE}/${CONFIGURE} --prefix=${INSTALL} --enable-xlib-glx
 --disable-dri --enable-gallium-llvm --with-llvm-shared-libs

 It did not produce a faster result: there was virtually no difference
 when running Chimera compared to using --with-gallium-drivers=swrast,
 unless my configuration is wrong. Please let me know the correct
 configuration for llvmpipe. The libdrm version is 2.4.42. The libllvm
 version is 3.2.

libdrm is irrelevant for llvmpipe.

 I can also use a test program to measure Mesa with different drivers, if
 you could let me know which test program can be used for benchmarking
 Mesa with the swrast, llvm, or DRI drivers. Where can I download the
 test program's source code?

The Mesa demos git tree can be cloned per
http://www.mesa3d.org/repository.html

 Please also see attached glxinfo for Mesa llvm.

The key line is "OpenGL renderer string: Mesa X11".  That's not
llvmpipe.  If you're really using llvmpipe it should say something
like "OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.2, 128
bits)".


Perhaps your LD_LIBRARY_PATH env is pointing at the wrong libGL.so

-Brian
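
The renderer-string check described above can also be done from a program rather than by reading glxinfo output. The sketch below only shows the string test; the helper name renderer_is_llvmpipe is made up, and the caller is assumed to obtain the string from glGetString(GL_RENDERER) with a current context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: report whether a GL_RENDERER string names llvmpipe.
 * The caller must pass a non-NULL string; glGetString() can return NULL
 * when no context is current, so that case should be checked first. */
bool renderer_is_llvmpipe(const char *renderer)
{
    assert(renderer != NULL);
    return strstr(renderer, "llvmpipe") != NULL;
}
```

With the strings from this thread, "Mesa X11" is rejected and "Gallium 0.4 on llvmpipe (LLVM 3.2, 128 bits)" is accepted.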

Re: [Mesa-dev] OSMesa VTK

2013-03-13 Thread Brian Paul


On 03/13/2013 04:04 AM, Kevin H. Hobbs wrote:

 My tests of nightly VTK built against nightly OSMesa failed last
 night.

 The onscreen builds are fine.

 The OSMesa build tests now fail with all black output but I
 haven't found any error message to inform me of what's going on.

 I know osmesa just switched to Gallium: that is reflected in
 VTK's LoadOpenGLExtension test output :

 GL_VERSION: 2.1 Mesa 9.2-devel (git-6173cc1)
 GL_RENDERER: Mesa OffScreen

 became

 GL_VERSION: 2.1 Mesa 9.2.0 (git-f7ef83c)
 GL_RENDERER: Gallium 0.4 on llvmpipe (LLVM 3.0, 128 bits)

 Mesa was built :

 ./autogen.sh \
   --prefix=/home/kevin/mesa_nightly \
   --enable-glx \
   --enable-dri \
   --enable-shared-glapi \
   --enable-gallium-llvm \
   --with-gallium-drivers=nouveau,swrast \
   --enable-osmesa


I just rebuilt with those options (but a different prefix) and OSMesa 
seems OK here.


Can you tell me what the parameters are for your OSMesaCreateContext() 
and OSMesaMakeCurrent() calls (in particular the image format/type)?


You could also try setting GALLIUM_DRIVER=softpipe and see if the 
softpipe driver works.


-Brian

[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=58718

--- Comment #15 from José Fonseca jfons...@vmware.com ---
(In reply to comment #14)
 I posted a series of patches to mesa3d-dev which seems to fix the inline
 issue.

I pushed these now, the most important being

commit 57cd1d1454653f778837eec0ee5d4060bc59c5ba
Author: José Fonseca jfons...@vmware.com
Date:   Tue Mar 12 20:37:47 2013 +

include: Fix build with VS 11 (i.e, 2012).

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul bri...@vmware.com

After rebuilding Mesa with VS 2012 the framebuffer2.trace no longer crashes.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] Mesa (master): mesa, gallium, egl, mapi: One definition of C99 inline/__func__ to rule them all.

2013-03-13 Thread Jose Fonseca

I think Vinson fixed that issue. Let me know if there are still build issues.

Jose

- Original Message -
 Sorry. I'll look into it. I did try running make locally, but it was not
 representative because it didn't have everything enabled.
 
 BTW, scons is also busted with the recent autotools changes.
 
 Jose
 
 - Original Message -
  I'm pretty sure this commit broke 'make check'.
  
  On 03/12/2013 03:07 PM, Jose Fonseca wrote:
   Module: Mesa
   Branch: master
   Commit: 70fe7c6d3e1c7534f6598c4616bebf672f42668b
   URL:
   http://cgit.freedesktop.org/mesa/mesa/commit/?id=70fe7c6d3e1c7534f6598c4616bebf672f42668b
  
   Author: José Fonseca jfons...@vmware.com
   Date:   Tue Mar 12 11:17:49 2013 +
  
   mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them
   all.
  
   We were in four already...
  
   NOTE: Candidate for the stable branches.
  
   Reviewed-by: Brian Paul bri...@vmware.com
  
   ---
  
 include/c99_compat.h  |  105
 +
 src/egl/main/eglcompiler.h|   44 ++
 src/gallium/include/pipe/p_compiler.h |   74 ++-
 src/mapi/mapi/u_compiler.h|   26 +---
 src/mesa/main/compiler.h  |   56 ++
 5 files changed, 125 insertions(+), 180 deletions(-)
  
   diff --git a/include/c99_compat.h b/include/c99_compat.h
   new file mode 100644
   index 000..39f958f
   --- /dev/null
   +++ b/include/c99_compat.h
   @@ -0,0 +1,105 @@
   +/**
   + *
   + * Copyright 2007-2013 VMware, Inc.
   + * All Rights Reserved.
   + *
   + * Permission is hereby granted, free of charge, to any person obtaining a
   + * copy of this software and associated documentation files (the
   + * "Software"), to deal in the Software without restriction, including
   + * without limitation the rights to use, copy, modify, merge, publish,
   + * distribute, sub license, and/or sell copies of the Software, and to
   + * permit persons to whom the Software is furnished to do so, subject to
   + * the following conditions:
   + *
   + * The above copyright notice and this permission notice (including the
   + * next paragraph) shall be included in all copies or substantial portions
   + * of the Software.
   + *
   + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
   + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
   + * IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
   + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
   + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
   + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
   + *
   +
   **/
   +
   +#ifndef _C99_COMPAT_H_
   +#define _C99_COMPAT_H_
   +
   +
   +/*
   + * C99 inline keyword
   + */
   +#ifndef inline
   +#  ifdef __cplusplus
   + /* C++ supports inline keyword */
   +#  elif defined(__GNUC__)
   +#define inline __inline__
   +#  elif defined(_MSC_VER)
   +#define inline __inline
   +#  elif defined(__ICL)
   +#define inline __inline
   +#  elif defined(__INTEL_COMPILER)
   + /* Intel compiler supports inline keyword */
   +#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
   +#define inline __inline
   +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
   + /* C99 supports inline keyword */
   +#  elif (__STDC_VERSION__ >= 199901L)
   + /* C99 supports inline keyword */
   +#  else
   +#define inline
   +#  endif
   +#endif
   +
   +
   +/*
   + * C99 restrict keyword
   + *
   + * See also:
   + * - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
   + */
   +#ifndef restrict
   +#  if (__STDC_VERSION__ >= 199901L)
   + /* C99 */
   +#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
   + /* C99 */
   +#  elif defined(__GNUC__)
   +#define restrict __restrict__
   +#  elif defined(_MSC_VER)
   +#define restrict __restrict
   +#  else
   +#define restrict /* */
   +#  endif
   +#endif
   +
   +
   +/*
   + * C99 __func__ macro
   + */
   +#ifndef __func__
   +#  if (__STDC_VERSION__ >= 199901L)
   + /* C99 */
   +#  elif defined(__SUNPRO_C)

Re: [Mesa-dev] [PATCH 10/12] Get rid of _mesa_vert_result_to_frag_attrib().

2013-03-13 Thread Eric Anholt

Paul Berry stereotype...@gmail.com writes:

 Now that there is no difference between the enums that represent
 vertex outputs and fragment inputs, there's no need for a conversion
 function.  But we still need to be able to detect when a given vertex
 output has no corresponding fragment input.  So it is replaced by a
 new function, _mesa_varying_slot_in_fs(), which tells whether the
 given varying slot exists as an FS input or not.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp| 12 -
  src/mesa/drivers/dri/i965/brw_vs_constval.c | 13 --
  src/mesa/main/mtypes.h  | 38 
 +
  3 files changed, 27 insertions(+), 36 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 86f8cbb..ea4a56c 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -1265,7 +1265,7 @@ fs_visitor::calculate_urb_setup()
  continue;
  
       if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
 -        int fp_index = _mesa_vert_result_to_frag_attrib((gl_varying_slot) i);
 +        bool exists_in_fs = _mesa_varying_slot_in_fs((gl_varying_slot) i);

I'd rather see this call moved into the single usage in the if statement
below, like has been done elsewhere (now that the function name
explicitly talks about what's being tested in the if anyway)

   /* The back color slot is skipped when the front color is
* also written to.  In addition, some slots can be
 @@ -1273,8 +1273,8 @@ fs_visitor::calculate_urb_setup()
* fragment shader.  So the register number must always be
* incremented, mapped or not.
*/
 - if (fp_index >= 0)
 -urb_setup[fp_index] = urb_next;
 + if (exists_in_fs)
 +urb_setup[i] = urb_next;
  urb_next++;


  /**
 - * Convert from a gl_varying_slot value for a vertex output to the
 - * corresponding gl_frag_attrib.
 - *
 - * Varying output values which have no corresponding gl_frag_attrib
 - * (VARYING_SLOT_PSIZ, VARYING_SLOT_BFC0, VARYING_SLOT_BFC1, and
 - * VARYING_SLOT_EDGE) are converted to a value of -1.
 + * Determine if the given gl_varying_slot appears in the fragment shader.
   */
 -static inline int
 -_mesa_vert_result_to_frag_attrib(gl_varying_slot vert_result)
 +static inline GLboolean
 +_mesa_varying_slot_in_fs(gl_varying_slot slot)
  {
 -   if (vert_result <= VARYING_SLOT_TEX7)
 -  return vert_result;
 -   else if (vert_result < VARYING_SLOT_CLIP_DIST0)
 -  return -1;
 -   else if (vert_result <= VARYING_SLOT_CLIP_DIST1)
 -  return vert_result;
 -   else if (vert_result < VARYING_SLOT_VAR0)
 -  return -1;
 -   else
 -  return vert_result;
 +   switch (slot) {
 +   case VARYING_SLOT_PSIZ:
 +   case VARYING_SLOT_BFC0:
 +   case VARYING_SLOT_BFC1:
 +   case VARYING_SLOT_EDGE:
 +   case VARYING_SLOT_CLIP_VERTEX:
 +   case VARYING_SLOT_LAYER:
 +  return GL_FALSE;
 +   default:
 +  return GL_TRUE;
 +   }
  }

I bet the compiler does a big switch statement instead of doing what we
could do better with bitfields.  Not a blocker, just a potential
improvement.

Other than that, I'm glad to see this series happen.

Reviewed-by: Eric Anholt e...@anholt.net
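
The bitfield idea mentioned above could look roughly like the following. This is only a sketch: the enum is a hypothetical stand-in for gl_varying_slot (the real values are in src/mesa/main/mtypes.h and are not reproduced here), and SKETCH_BIT64 merely imitates what BITFIELD64_BIT does. It assumes every slot value is below 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for gl_varying_slot. The real enum lives in
 * src/mesa/main/mtypes.h; these values are illustrative only. */
enum sketch_varying_slot {
   SKETCH_SLOT_POS,
   SKETCH_SLOT_COL0,
   SKETCH_SLOT_PSIZ,
   SKETCH_SLOT_BFC0,
   SKETCH_SLOT_BFC1,
   SKETCH_SLOT_EDGE,
   SKETCH_SLOT_CLIP_VERTEX,
   SKETCH_SLOT_LAYER,
   SKETCH_SLOT_VAR0
};

/* One bit per slot, so slot values must stay below 64. */
#define SKETCH_BIT64(b) (UINT64_C(1) << (b))

/* Compile-time mask of the slots that never appear as FS inputs. */
#define SKETCH_NOT_IN_FS_MASK                 \
   (SKETCH_BIT64(SKETCH_SLOT_PSIZ) |          \
    SKETCH_BIT64(SKETCH_SLOT_BFC0) |          \
    SKETCH_BIT64(SKETCH_SLOT_BFC1) |          \
    SKETCH_BIT64(SKETCH_SLOT_EDGE) |          \
    SKETCH_BIT64(SKETCH_SLOT_CLIP_VERTEX) |   \
    SKETCH_BIT64(SKETCH_SLOT_LAYER))

/* A single AND against a constant replaces the switch statement. */
static inline bool
sketch_varying_slot_in_fs(enum sketch_varying_slot slot)
{
   return (SKETCH_NOT_IN_FS_MASK & SKETCH_BIT64(slot)) == 0;
}
```

Whether this actually beats the code the compiler generates for the switch is untested here; it would need measuring.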



Re: [Mesa-dev] [PATCH] DRI2: don't advertise GLX_INTEL_swap_event if it can't

2013-03-13 Thread Zack Rusin

 well, I'm more familiar w/ EGL where we don't have the xserver
 advertising anything, and it is all on the client side.. but when it
 is an inexpensive check, it seems reasonable to want mesa to do the
 right thing where possible. 

It's simply silly. In the same sense that adding yet another if (ptr) to if 
(ptr) if (ptr) FREE(ptr); while not technically wrong is simply silly. Like I 
said we already check whether those extensions are advertised by the server and 
don't advertise the ones that aren't.

 Probably there are other cases where we
 should do the same thing.  I can update my patch to also exclude other
 extensions

No, the point is that we don't want to do that. It's fundamentally broken and 
you know that it's broken because you'll notice that this extension is still 
advertised by the server (for our sake that's all required to fix Clutter, but 
it's still broken). It's a weird thing for an extension which is implemented by 
the server to be advertised by the server and yet having a client which is 
essentially not involved at all, not be advertising it. The only reason we have 
to worry about this is that the server is broken. So while we might want to 
make things easier on us by not forcing users to keep repatching the Xserver we 
shouldn't have any illusions about what this is: it's a nasty hack required by 
a bug in the Xserver. As such that code has only two requirements:
1) That all drivers requiring that hack go through the same codepath and that 
it's as minimal as possible so it's trivial to remove it once a fixed Xserver 
gets into most distros.
2) That it's clearly documented as a hack, thanks to which anyone reading this 
code will immediately understand what's the purpose of the weird code and what 
are the prerequisites for removing it.
Everything else is of no consequence in this case. So whether you'll decide to 
use names or any number of other extensions that came after dri2inforec 
version 4 to check for makes no difference as long as it fulfills the two above 
goals. 
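
As a rough illustration of the two requirements, a single documented helper that every caller goes through might look like the sketch below. The function name is hypothetical; the "vmwgfx" string and the swap-availability flag are the only pieces taken from this thread, and the real check belongs in src/glx/dri2_glx.c.

```c
#include <stdbool.h>
#include <string.h>

/*
 * HACK: some X servers advertise GLX_INTEL_swap_event even when the
 * DDX cannot schedule swaps. Remove this helper once a fixed X server
 * has reached most distros.
 *
 * Hypothetical helper: the name and signature are illustrative only.
 */
static bool
sketch_can_advertise_swap_event(const char *driver_name, bool swap_available)
{
   /* Drivers known not to support the extension are excluded by name. */
   if (strcmp(driver_name, "vmwgfx") == 0)
      return false;

   /* Older (pre-ScheduleSwap) DRI2 in the DDX: do not advertise. */
   return swap_available;
}
```

Keeping the whole workaround in one function with the removal condition in the comment is what makes it trivial to delete later.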

 true, it is not shipping in any distro yet, so anyone who wants to try
 it gets to try git master of mesa, which runs into problems because of
 advertising the INTEL_swap extension.  Asking everyone to rebuild
 xserver with some extra patch which is not merged yet is a big pita.

Sure, but at the same time adding hacks to shared mesa code to make it easier 
to try a dev driver doesn't make a terribly convincing argument. In the end 
though, at least in this case, the bug is severe enough that a hack in mesa 
makes sense and we've spent too much time discussing a very simple issue, so 
whatever you do just please make sure to fulfill the two requirements above and 
everything will be ok.

z

Re: [Mesa-dev] [PATCH 1/2] i965: Apply depthstencil alignment workaround when doing fast clears.

2013-03-13 Thread Eric Anholt

Paul Berry stereotype...@gmail.com writes:

 On 12 March 2013 12:53, Paul Berry stereotype...@gmail.com wrote:

 On 12 March 2013 12:28, Eric Anholt e...@anholt.net wrote:

 Paul Berry stereotype...@gmail.com writes:

  Fast depth clears have the same depth/stencil alignment requirements
  as other drawing operations.  Therefore, we need to call
  brw_workaround_depthstencil_alignment() from both the clear and
  drawing paths.
 
  Without this fix, we get image corruption if the following conditions
  hold: (a) the first ever drawing operation to a depth miplevel (or the
  first drawing operation after having used the texture for sampling) is
  a clear, (b) the depth miplevel has a size that is eligible for fast
  depth clears, and (c) the depth miplevel has an offset within the
  miptree that isn't 8x8 aligned.
 
  Fixes piglit depthstencil-render-miplevels tests with size 273.
 
  NOTE: This is a candidate for stable branches
  ---
   src/mesa/drivers/dri/i965/brw_clear.c | 6 +-
   1 file changed, 5 insertions(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_clear.c
 b/src/mesa/drivers/dri/i965/brw_clear.c
  index 53d8e54..cde1a06 100644
  --- a/src/mesa/drivers/dri/i965/brw_clear.c
  +++ b/src/mesa/drivers/dri/i965/brw_clear.c
  @@ -40,6 +40,8 @@
   #include "intel_mipmap_tree.h"
   #include "intel_regions.h"
 
  +#include "brw_context.h"
  +
   #define FILE_DEBUG_FLAG DEBUG_BLIT
 
   static const char *buffer_names[] = {
  @@ -219,7 +221,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
   static void
   brw_clear(struct gl_context *ctx, GLbitfield mask)
   {
  -   struct intel_context *intel = intel_context(ctx);
  +   struct brw_context *brw = brw_context(ctx);
  +   struct intel_context *intel = &brw->intel;
 
  if (!_mesa_check_conditional_render(ctx))
 return;
  @@ -229,6 +232,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask)
  }
 
  intel_prepare_render(intel);
  +   brw_workaround_depthstencil_alignment(brw);

 It seems like this should be happening in brw_fast_clear(), either
 before calling blorp or inside of it, instead of in the potential
 caller of brw_fast_clear().  Makes sense, though.


 Chad made the same comment to me in person yesterday.  The reason I put it
 here is to accommodate patch 2/2 (which allows
 brw_workaround_depthstencil_alignment to avoid an unnecessary copy when
 clearing the whole miplevel).  If I move the call to
 brw_workaround_depthstencil_alignment into brw_fast_clear_depth(), then the
 unnecessary copy will only be avoided when doing depth clears.  If I leave
 it here, the unnecessary copy will be avoided for all clears.


 Correction: when I wrote this I momentarily forgot that the workaround is
 only needed for depth and stencil buffers.  So leaving the call to
 brw_workaround_depthstencil_alignment here allows us to avoid the
 unnecessary copy for both depth and stencil clears, not just depth clears.
  I still think it's worth it, but it's a far less convincing case.

You convinced me, though.

Reviewed-by: Eric Anholt e...@anholt.net for both.
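
For readers unfamiliar with condition (c) in the commit message, the 8x8 alignment test on a miplevel's offset within the miptree amounts to something like the sketch below. The function and macro names are hypothetical and this is not the code in brw_workaround_depthstencil_alignment(); it only shows the predicate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Alignment named in the commit message; must be a power of two. */
#define SKETCH_DS_ALIGN 8u

/* True when both offsets are multiples of 8, i.e. when the low three
 * bits of x and y are all zero. Hypothetical helper for illustration. */
static bool
sketch_offset_is_8x8_aligned(uint32_t x_offset, uint32_t y_offset)
{
   const uint32_t mask = SKETCH_DS_ALIGN - 1;

   return (x_offset & mask) == 0 && (y_offset & mask) == 0;
}
```

A miplevel whose offset fails this test is the case that triggers the workaround, for clears as well as draws.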



Re: [Mesa-dev] [PATCH] DRI2: don't advertise GLX_INTEL_swap_event if it can't

2013-03-13 Thread Rob Clark

On Wed, Mar 13, 2013 at 11:19 AM, Zack Rusin za...@vmware.com wrote:
 well, I'm more familiar w/ EGL where we don't have the xserver
 advertising anything, and it is all on the client side.. but when it
 is an inexpensive check, it seems reasonable to want mesa to do the
 right thing where possible.

 It's simply silly. In the same sense that adding yet another if (ptr) to if 
 (ptr) if (ptr) FREE(ptr); while not technically wrong is simply silly. Like 
 I said we already check whether those extensions are advertised by the server 
 and don't advertise the ones that aren't.


well, if the other component that provided FREE() had a bug that it
didn't check for null, it wouldn't be completely silly.  But still a
hack/workaround..

 Probably there are other cases where we
 should do the same thing.  I can update my patch to also exclude other
 extensions

 No, the point is that we don't want to do that. It's fundamentally broken and 
 you know that it's broken because you'll notice that this extension is still 
 advertised by the server (for our sake that's all required to fix Clutter, 
 but it's still broken). It's a weird thing for an extension which is 
 implemented by the server to be advertised by the server and yet having a 
 client which is essentially not involved at all, not be advertising it. The 
 only reason we have to worry about this is that the server is broken. So 
 while we might want to make things easier on us by not forcing users to keep 
 repatching the Xserver we shouldn't have any illusions about what this is: 
 it's a nasty hack required by a bug in the Xserver. As such that code has 
 only two requirements:
 1) That all drivers requiring that hack go through the same codepath and that 
 it's as minimal as possible so it's trivial to remove it once a fixed Xserver 
 gets into most distros.
 2) That it's clearly documented as a hack, thanks to which anyone reading this 
 code will immediately understand what's the purpose of the weird code and 
 what are the prerequisites for removing it.
 Everything else is of no consequence in this case. So whether you'll decide 
 to use names or any number of other extensions that came after 
 dri2inforec version 4 to check for makes no difference as long as it fulfills 
 the two above goals.


I'm ok with documenting it as a hack, and removing it once updated
xserver is in most distro's.  But it does seem useful to have at least
in the short term.

 true, it is not shipping in any distro yet, so anyone who wants to try
 it gets to try git master of mesa, which runs into problems because of
 advertising the INTEL_swap extension.  Asking everyone to rebuild
 xserver with some extra patch which is not merged yet is a big pita.

 Sure, but at the same time adding hacks to shared mesa code to make it easier 
 to try a dev driver doesn't make a terribly convincing argument. In the end 
 though, at least in this case, the bug is severe enough that a hack in mesa 
 makes sense and we've spent too much time discussing a very simple issue, so 
 whatever you do just please make sure to fulfill the two requirements above 
 and everything will be ok.


true, I suppose.. although there are currently enough challenges
getting proper linux running on some of these devices, I don't really
like to make it harder than it already is.

I'll re-submit the patch making it more clear that it is a hack.  I
think point #1 is already met, it is a pretty localized hack and
should be easy to remove later.

BR,
-R

 z

[Mesa-dev] [PATCH] DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap

2013-03-13 Thread Rob Clark

If ddx does not support swap, don't advertise it.  This is a hack to
work around current xservers which advertise this extension even when it
is clearly not supported.  When:

http://lists.x.org/archives/xorg-devel/2013-February/035449.html

is merged in upstream xserver and makes its way into most distros then
this hack can be removed.  In the meantime, it is required to allow
gnome-shell/clutter/etc to work properly with a DDX driver which does
not support ScheduleSwap.

Signed-off-by: Rob Clark robdcl...@gmail.com
---
 src/glx/dri2_glx.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index c4f6996..7ce5775 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1051,11 +1051,16 @@ static const struct glx_context_vtable 
dri2_context_vtable = {
 };
 
 static void
-dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
+dri2BindExtensions(struct dri2_screen *psc, struct glx_display * priv,
const char *driverName)
 {
+   const struct dri2_display *const pdp = (struct dri2_display *)
+  priv->dri2Display;
+   const __DRIextension **extensions;
int i;
 
+   extensions = psc->core->getExtensions(psc->driScreen);
+
__glXEnableDirectExtension(&psc->base, "GLX_SGI_video_sync");
__glXEnableDirectExtension(&psc->base, "GLX_SGI_swap_control");
__glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control");
@@ -1066,10 +1071,15 @@ dri2BindExtensions(struct dri2_screen *psc, const 
__DRIextension **extensions,
 * currently unconditionally enabled. This completely breaks
 * systems running on drivers which don't support that extension.
 * There's no way to test for its presence on this side, so instead
-* of disabling it uncondtionally, just disable it for drivers
-* which are known to not support it.
+* of disabling it unconditionally, just disable it for drivers
+* which are known to not support it, or for DDX drivers supporting
+* only an older (pre-ScheduleSwap) version of DRI2.
+*
+* This is a hack which is required until:
+* http://lists.x.org/archives/xorg-devel/2013-February/035449.html
+* is merged and updated xserver makes it's way into distros:
 */
-   if (strcmp(driverName, "vmwgfx") != 0) {
+   if (pdp->swapAvailable && strcmp(driverName, "vmwgfx") != 0) {
   __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
}
 
@@ -1212,8 +1222,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
   goto handle_error;
}
 
-   extensions = psc->core->getExtensions(psc->driScreen);
-   dri2BindExtensions(psc, extensions, driverName);
+   dri2BindExtensions(psc, priv, driverName);
 
configs = driConvertConfigs(psc->core, psc->base.configs, driver_configs);
visuals = driConvertConfigs(psc->core, psc->base.visuals, driver_configs);
-- 
1.8.1.4


Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.

2013-03-13 Thread Eric Anholt

Kenneth Graunke kenn...@whitecape.org writes:

 This optimization attempts to avoid extra attribute interpolation
 instructions for texture coordinates where the W-component is 1.0.

 Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes
 state atom (all the brw_vs_constval.c code) needs to run on each draw.
 It computes the input_size_masks array, then uses that to compute
 proj_attrib_mask.  Differences in proj_attrib_mask can cause
 state-dependent fragment shader recompiles.  We also often fail to guess
 proj_attrib_mask for the fragment shader precompile, causing us to
 needlessly compile it twice.

 Furthermore, this optimization only applies to fixed-function programs;
 it does not help modern GLSL-based programs at all.  Generally, older
 fixed-function programs run fine on modern hardware anyway.

 The optimization has existed in some form since the initial commit.  When
 we rewrote the fragment shader backend, we dropped it for a while.  Eric
 readded it in commit eb30820f268608cf451da32de69723036dddbc62 as part of
 an attempt to cure a ~1% performance regression caused by converting the
 fixed-function fragment shader generation code from Mesa IR to GLSL IR.
 However, no performance data was included in the commit message, so it's
 unclear whether or not it was successful.

 Time has passed, so I decided to re-measure this.  Surprisingly,
 Eric's OpenArena timedemo actually runs /faster/ after removing this and
 the brw_wm_input_sizes atom.  On Ivybridge at 1024x768, I measured a
 1.39532% +/- 0.91833% increase in FPS (n = 55).

Removing it on SNB+ makes sense to me.  But given the higher cost of
math pre-gen6, I think we should test on one of those too.



Re: [Mesa-dev] [PATCH 2/3] Fix glapi/tests/check_table.cpp for standardized OpenGL function names

2013-03-13 Thread Ian Romanick


On 02/27/2013 08:36 AM, Jon TURNEY wrote:

It looks like this has been broken since commit
1a1db1746db82efc7f0643508886dfc78a15eb71 Standardize names of OpenGL
functions.


As far as I can tell, I run this test with every build, and I've never 
seen it fail.  The ARB names are part of the Linux ABI, so if they're 
not working, something is catastrophically broken... changing the test 
just ignores the problem.



Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk
---
  src/mapi/glapi/tests/check_table.cpp |  528 +-
  1 files changed, 264 insertions(+), 264 deletions(-)

diff --git a/src/mapi/glapi/tests/check_table.cpp 
b/src/mapi/glapi/tests/check_table.cpp
index 807d3c3..dffec83 100644
--- a/src/mapi/glapi/tests/check_table.cpp
+++ b/src/mapi/glapi/tests/check_table.cpp
@@ -523,40 +523,40 @@ const struct name_offset linux_gl_abi[] = {
 { glTexImage3D, 371 },
 { glTexSubImage3D, 372 },
 { glCopyTexSubImage3D, 373 },
-   { glActiveTextureARB, 374 },
-   { glClientActiveTextureARB, 375 },
-   { glMultiTexCoord1dARB, 376 },
-   { glMultiTexCoord1dvARB, 377 },
+   { glActiveTexture, 374 },
+   { glClientActiveTexture, 375 },
+   { glMultiTexCoord1d, 376 },
+   { glMultiTexCoord1dv, 377 },
 { glMultiTexCoord1fARB, 378 },
 { glMultiTexCoord1fvARB, 379 },
-   { glMultiTexCoord1iARB, 380 },
-   { glMultiTexCoord1ivARB, 381 },
-   { glMultiTexCoord1sARB, 382 },
-   { glMultiTexCoord1svARB, 383 },
-   { glMultiTexCoord2dARB, 384 },
-   { glMultiTexCoord2dvARB, 385 },
+   { glMultiTexCoord1i, 380 },
+   { glMultiTexCoord1iv, 381 },
+   { glMultiTexCoord1s, 382 },
+   { glMultiTexCoord1sv, 383 },
+   { glMultiTexCoord2d, 384 },
+   { glMultiTexCoord2dv, 385 },
 { glMultiTexCoord2fARB, 386 },
 { glMultiTexCoord2fvARB, 387 },
-   { glMultiTexCoord2iARB, 388 },
-   { glMultiTexCoord2ivARB, 389 },
-   { glMultiTexCoord2sARB, 390 },
-   { glMultiTexCoord2svARB, 391 },
-   { glMultiTexCoord3dARB, 392 },
-   { glMultiTexCoord3dvARB, 393 },
+   { glMultiTexCoord2i, 388 },
+   { glMultiTexCoord2iv, 389 },
+   { glMultiTexCoord2s, 390 },
+   { glMultiTexCoord2sv, 391 },
+   { glMultiTexCoord3d, 392 },
+   { glMultiTexCoord3dv, 393 },
 { glMultiTexCoord3fARB, 394 },
 { glMultiTexCoord3fvARB, 395 },
-   { glMultiTexCoord3iARB, 396 },
-   { glMultiTexCoord3ivARB, 397 },
-   { glMultiTexCoord3sARB, 398 },
-   { glMultiTexCoord3svARB, 399 },
-   { glMultiTexCoord4dARB, 400 },
-   { glMultiTexCoord4dvARB, 401 },
+   { glMultiTexCoord3i, 396 },
+   { glMultiTexCoord3iv, 397 },
+   { glMultiTexCoord3s, 398 },
+   { glMultiTexCoord3sv, 399 },
+   { glMultiTexCoord4d, 400 },
+   { glMultiTexCoord4dv, 401 },
 { glMultiTexCoord4fARB, 402 },
 { glMultiTexCoord4fvARB, 403 },
-   { glMultiTexCoord4iARB, 404 },
-   { glMultiTexCoord4ivARB, 405 },
-   { glMultiTexCoord4sARB, 406 },
-   { glMultiTexCoord4svARB, 407 },
+   { glMultiTexCoord4i, 404 },
+   { glMultiTexCoord4iv, 405 },
+   { glMultiTexCoord4s, 406 },
+   { glMultiTexCoord4sv, 407 },
 { NULL, 0 }
  };

@@ -937,40 +937,40 @@ const struct name_offset known_dispatch[] = {
 { glTexImage3D, _O(TexImage3D) },
 { glTexSubImage3D, _O(TexSubImage3D) },
 { glCopyTexSubImage3D, _O(CopyTexSubImage3D) },
-   { glActiveTextureARB, _O(ActiveTextureARB) },
-   { glClientActiveTextureARB, _O(ClientActiveTextureARB) },
-   { glMultiTexCoord1dARB, _O(MultiTexCoord1dARB) },
-   { glMultiTexCoord1dvARB, _O(MultiTexCoord1dvARB) },
+   { glActiveTexture, _O(ActiveTexture) },
+   { glClientActiveTexture, _O(ClientActiveTexture) },
+   { glMultiTexCoord1d, _O(MultiTexCoord1d) },
+   { glMultiTexCoord1dv, _O(MultiTexCoord1dv) },
 { glMultiTexCoord1fARB, _O(MultiTexCoord1fARB) },
 { glMultiTexCoord1fvARB, _O(MultiTexCoord1fvARB) },
-   { glMultiTexCoord1iARB, _O(MultiTexCoord1iARB) },
-   { glMultiTexCoord1ivARB, _O(MultiTexCoord1ivARB) },
-   { glMultiTexCoord1sARB, _O(MultiTexCoord1sARB) },
-   { glMultiTexCoord1svARB, _O(MultiTexCoord1svARB) },
-   { glMultiTexCoord2dARB, _O(MultiTexCoord2dARB) },
-   { glMultiTexCoord2dvARB, _O(MultiTexCoord2dvARB) },
+   { glMultiTexCoord1i, _O(MultiTexCoord1i) },
+   { glMultiTexCoord1iv, _O(MultiTexCoord1iv) },
+   { glMultiTexCoord1s, _O(MultiTexCoord1s) },
+   { glMultiTexCoord1sv, _O(MultiTexCoord1sv) },
+   { glMultiTexCoord2d, _O(MultiTexCoord2d) },
+   { glMultiTexCoord2dv, _O(MultiTexCoord2dv) },
 { glMultiTexCoord2fARB, _O(MultiTexCoord2fARB) },
 { glMultiTexCoord2fvARB, _O(MultiTexCoord2fvARB) },
-   { glMultiTexCoord2iARB, _O(MultiTexCoord2iARB) },
-   { glMultiTexCoord2ivARB, _O(MultiTexCoord2ivARB) },
-   { glMultiTexCoord2sARB, _O(MultiTexCoord2sARB) },
-   { glMultiTexCoord2svARB, _O(MultiTexCoord2svARB) },
-   { glMultiTexCoord3dARB, _O(MultiTexCoord3dARB) },
-   { glMultiTexCoord3dvARB, _O(MultiTexCoord3dvARB) },
+   { glMultiTexCoord2i, _O(MultiTexCoord2i) },
+   { glMultiTexCoord2iv, _O(MultiTexCoord2iv) },
+   { glMultiTexCoord2s,

Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.

2013-03-13 Thread Kenneth Graunke


On 03/13/2013 06:07 AM, Adrian M Negreanu wrote:

Hi,

I have tested your changes but it looks like they fail to compile on Android.


==
Tested the patch(es) on top of the following commits:
f7ef83c scons: Define PACKAGE_xxx
6f86b93 docs: rewrite the OSMesa info / instructions
79eac7d configure: wire-up new OSMesa gallium state tracker and target
be51f12 target/osmesa: add new Makefile.am
94263da targets/osmesa: new OSMesa gallium target
7114b6a st/osmesa: add new Makefile.am
73436a9 st/osmesa: new OSMesa gallium state tracker

===
Failed to build for android

f7ef83c scons: Define PACKAGE_xxx
6f86b93 docs: rewrite the OSMesa info / instructions
79eac7d configure: wire-up new OSMesa gallium state tracker and target
be51f12 target/osmesa: add new Makefile.am
94263da targets/osmesa: new OSMesa gallium target
7114b6a st/osmesa: add new Makefile.am
73436a9 st/osmesa: new OSMesa gallium state tracker
src/mesa/drivers/dri/i965/brw_state_dump.c: In function
'dump_depth_stencil_state':
src/mesa/drivers/dri/i965/brw_state_dump.c:373:71: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_cc_state_gen6':
src/mesa/drivers/dri/i965/brw_state_dump.c:407:68: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_scissor':
src/mesa/drivers/dri/i965/brw_state_dump.c:436:65: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_vs_constants':
src/mesa/drivers/dri/i965/brw_state_dump.c:449:49: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c:450:47: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_wm_constants':
src/mesa/drivers/dri/i965/brw_state_dump.c:466:49: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c:467:47: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_binding_table':
src/mesa/drivers/dri/i965/brw_state_dump.c:483:50: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_prog_cache':
src/mesa/drivers/dri/i965/brw_state_dump.c:511:33: warning: pointer of
type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function
'brw_upload_vs_pull_constants':
src/mesa/drivers/dri/i965/brw_vs_surface_state.c:77:40: warning:
pointer of type 'void *' used in arithmetic [-Wpointer-arith]
src/mesa/main/errors.c: In function '_mesa_problem':
src/mesa/main/errors.c:851:15: error: 'PACKAGE_VERSION' undeclared
(first use in this function)


That's definitely Matt's build system changes, not my code.


Re: [Mesa-dev] Mesa (master): mesa, gallium, egl, mapi: One definition of C99 inline/ func to rule them all.

2013-03-13 Thread Ian Romanick


On 03/13/2013 01:14 AM, Jose Fonseca wrote:

Sorry. I'll look into it. I did try running make locally, but it was not 
representative because it didn't have everything enabled.

BTW, scons is also busted with the recent autotools changes.


Blarg.  Of course it did. :(  Has the break been reported to the guilty 
party?  I haven't been following any of the build system discussion very 
closely over the last few weeks...



Jose

- Original Message -

I'm pretty sure this commit broke 'make check'.

On 03/12/2013 03:07 PM, Jose Fonseca wrote:

Module: Mesa
Branch: master
Commit: 70fe7c6d3e1c7534f6598c4616bebf672f42668b
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=70fe7c6d3e1c7534f6598c4616bebf672f42668b

Author: José Fonseca jfons...@vmware.com
Date:   Tue Mar 12 11:17:49 2013 +

mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them
all.

We were in four already...

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul bri...@vmware.com

---

   include/c99_compat.h  |  105
   +
   src/egl/main/eglcompiler.h|   44 ++
   src/gallium/include/pipe/p_compiler.h |   74 ++-
   src/mapi/mapi/u_compiler.h|   26 +---
   src/mesa/main/compiler.h  |   56 ++
   5 files changed, 125 insertions(+), 180 deletions(-)

diff --git a/include/c99_compat.h b/include/c99_compat.h
new file mode 100644
index 000..39f958f
--- /dev/null
+++ b/include/c99_compat.h
@@ -0,0 +1,105 @@
+/**
+ *
+ * Copyright 2007-2013 VMware, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+
**/
+
+#ifndef _C99_COMPAT_H_
+#define _C99_COMPAT_H_
+
+
+/*
+ * C99 inline keyword
+ */
+#ifndef inline
+#  ifdef __cplusplus
+ /* C++ supports inline keyword */
+#  elif defined(__GNUC__)
+#define inline __inline__
+#  elif defined(_MSC_VER)
+#define inline __inline
+#  elif defined(__ICL)
+#define inline __inline
+#  elif defined(__INTEL_COMPILER)
+ /* Intel compiler supports inline keyword */
+#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
+#define inline __inline
+#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
+ /* C99 supports inline keyword */
+#  elif (__STDC_VERSION__ >= 199901L)
+ /* C99 supports inline keyword */
+#  else
+#define inline
+#  endif
+#endif
+
+
+/*
+ * C99 restrict keyword
+ *
+ * See also:
+ * - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
+ */
+#ifndef restrict
+#  if (__STDC_VERSION__ >= 199901L)
+ /* C99 */
+#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
+ /* C99 */
+#  elif defined(__GNUC__)
+#define restrict __restrict__
+#  elif defined(_MSC_VER)
+#define restrict __restrict
+#  else
+#define restrict /* */
+#  endif
+#endif
+
+
+/*
+ * C99 __func__ macro
+ */
+#ifndef __func__
+#  if (__STDC_VERSION__ >= 199901L)
+ /* C99 */
+#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
+ /* C99 */
+#  elif defined(__GNUC__)
+#if __GNUC__ >= 2
+#  define __func__ __FUNCTION__
+#else
+#  define __func__ "<unknown>"
+#endif
+#  elif defined(_MSC_VER)
+#if _MSC_VER >= 1300
+#  define __func__

Re: [Mesa-dev] Mesa (master): mesa, gallium, egl, mapi: One definition of C99 inline/ func to rule them all.

2013-03-13 Thread Jose Fonseca

- Original Message -
 On 03/13/2013 01:14 AM, Jose Fonseca wrote:
  Sorry. I'll look into it. I did try running make locally, but it was not
  representative because it didn't have everything enabled.
 
  BTW, scons is also busted with the recent autotools changes.
 
 Blarg.  Of course it did. :(  Has the break been reported to the guilty
 party?  

It's fixed now. It was simple stuff.

 I haven't been following any of the build system discussion very
 closely over the last few weeks...

I also tend to gloss over autotools review requests, as I'm not very familiar 
with autotools, but I keep forgetting to look out for the impact on scons.

Anyway, no biggie, at least as far as I'm concerned.  It's so hard to build mesa 
with all possible build systems, platforms, and required dependencies that I 
don't think it's anybody's fault when builds get broken every now and then; it's 
just the nature of the beast.  As long as the build gets fixed in a timely 
fashion, bisecting through the past shouldn't be too painful. 

Jose

Re: [Mesa-dev] [PATCH] Fix for dispatch table entries and es2-compatibility mode

2013-03-13 Thread Ian Romanick

Please don't reply off list.  Subscribe to the list and participate.  We 
can't make decisions that affect everyone behind closed doors.


On 03/13/2013 02:32 AM, Zawistowski, Bartosz L wrote:

Hi Ian,

Thank you for quick feedback. This fix is aimed at separating EXT and
ARB framebuffer_object extensions. According to GL spec entry points
of these two extensions need to have different implementations. There
is no such distinction now in DRI drivers nor in glapi interface.
This patch allows other DRI drivers (not delivered with mesa) that
use glapi to take advantage of both extensions. In order to satisfy
compilation of mesa DRI drivers, name wrappers have been provided for
new api calls.


We've had some discussion about this on the mesa-dev list before.  You 
should search the archives.  I believe the only functions that are 
different are glBindRenderbuffer and glBindFramebuffer.  All of the 
other functions have the same GLX protocol opcode, so they are 
indistinguishable in indirect rendering.


Splitting functions other than glBind{Frame,Render}buffer is definitely 
wrong.



Could you please provide more details regarding compatibility issues
between libGL and existing DRI drivers you have mentioned?


The driver asks libGL where various functions belong in the dispatch 
table.  Right now, drivers only ask for one of the names and assume all 
variations will go through the same dispatch entry.  With this change, 
existing driver binaries won't set the dispatch pointer for 
glBindFramebuffer (only for glBindFramebufferEXT), for example.
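
To make the compatibility problem concrete, here is a minimal sketch of the lookup described above. It is not the real glapi code: the slot numbers, the two tables, and the helper names (slot_for_name, arb_entry_is_set) are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slot numbers; the real glapi offsets are different. */
enum { SLOT_BIND_FRAMEBUFFER, SLOT_BIND_FRAMEBUFFER_EXT, SLOT_COUNT };

struct name_slot { const char *name; int slot; };

/* Today: the ARB and EXT names are aliases of one dispatch entry. */
const struct name_slot aliased_table[] = {
    { "glBindFramebuffer",    SLOT_BIND_FRAMEBUFFER },
    { "glBindFramebufferEXT", SLOT_BIND_FRAMEBUFFER },
};

/* With the proposed split: each name has its own entry. */
const struct name_slot split_table[] = {
    { "glBindFramebuffer",    SLOT_BIND_FRAMEBUFFER },
    { "glBindFramebufferEXT", SLOT_BIND_FRAMEBUFFER_EXT },
};

/* What libGL answers when a driver asks where a function belongs. */
int slot_for_name(const struct name_slot *tbl, size_t n, const char *name)
{
    assert(tbl != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return tbl[i].slot;
    return -1;
}

typedef void (*dispatch_proc)(void);

void driver_bind_framebuffer(void) { /* stand-in for the driver's function */ }

/* An existing driver binary asks for one name only (the EXT one here).
 * Returns 1 if the ARB entry ended up populated, 0 if it stayed NULL. */
int arb_entry_is_set(const struct name_slot *tbl, size_t n)
{
    dispatch_proc dispatch[SLOT_COUNT] = { NULL };
    int slot = slot_for_name(tbl, n, "glBindFramebufferEXT");

    if (slot >= 0)
        dispatch[slot] = driver_bind_framebuffer;
    return dispatch[SLOT_BIND_FRAMEBUFFER] != NULL;
}
```

With aliased_table the ARB entry is filled as a side effect of registering the EXT name; with split_table it stays NULL, which is the breakage for existing driver binaries.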


Since the same mechanism is used inside the xserver, changing the 
drivers would also require changes in the server's GLX code.


What we have done is implement the ARB behavior for both versions.  If 
you ask 100 developers what the difference is, I would bet at least 97 
respond, "There's a difference?" :)  And the other 3 will wonder what 
the point of using non-Gen names is.


In short... Yes, there's a bug, but fixing it makes things worse for 
very little, if any, actual benefit.


The bigger thing to worry about is that glBindFramebuffer in OpenGL ES 
2.0 (and 3.0) behaves differently from glBindFramebuffer in desktop OpenGL 
3.0.  Since you know which API was used to create the context, just 
select the behavior in the implementation of glBindFramebuffer based on 
the API.
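
A rough sketch of what selecting on the context API could look like. The assumption here is that the difference in question is whether names that did not come from glGenFramebuffers are accepted (ES accepts them, the ARB/desktop behavior rejects them). The enum and function names are hypothetical, not Mesa's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the API the context was created with. */
enum gl_api_kind { API_DESKTOP_GL, API_GLES2 };

/* One implementation of glBindFramebuffer, one dispatch entry; the
 * behavior is chosen from the context's API instead of from the
 * entry point name.  Returns true if the bind should proceed. */
bool bind_framebuffer_name_ok(enum gl_api_kind api, unsigned name,
                              bool name_was_generated)
{
    assert(api == API_DESKTOP_GL || api == API_GLES2);

    /* Name 0 (the window-system framebuffer) and Gen'd names are
     * fine everywhere. */
    if (name == 0 || name_was_generated)
        return true;

    /* Assumption: only ES accepts application-chosen names; the
     * desktop path would raise GL_INVALID_OPERATION instead. */
    return api == API_GLES2;
}
```

This keeps a single dispatch slot, so nothing changes for libGL, the xserver, or existing driver binaries.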



[Mesa-dev] Google Summer of Code ideas needed

2013-03-13 Thread Tom Stellard

Hi,

It's time again for Google Summer of Code, so we need to start updating
the X.Org ideas page (http://www.x.org/wiki/SummerOfCodeIdeas) with new
ideas.  Since there have been a few issues with the wikis lately, if you
have any ideas please respond to this thread, and I will make sure they
get onto the official ideas page (but still feel free to update the wiki
page yourself if you can).  A good project description should contain:

- A brief description of the project
- A difficulty rating (e.g. easy, medium, hard)
- The skills / programming languages required

Also, I am going to purge all the old ideas from the ideas page in the
next week, so if there are any old ideas that you think are
still relevant, let me know and I will keep them.

The ideas page is used as one of the criteria by Google for selecting
mentoring organizations and part of the reason X.Org was not selected
last year was that the ideas page was not up to par, so if we want to
participate in Google Summer of Code this year, it is important we
have a good ideas page with lots of ideas.

Thanks,
Tom Stellard

[Mesa-dev] [PATCH] i965: Split shader_time entries into separate cachelines.

2013-03-13 Thread Eric Anholt

This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).

Improves performance of a minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).

v2: Add a define for the stride with a comment explaining its units and
why.
---
 src/mesa/drivers/dri/i965/brw_context.h |8 
 src/mesa/drivers/dri/i965/brw_fs.cpp|2 +-
 src/mesa/drivers/dri/i965/brw_program.c |5 +++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp  |2 +-
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index c34d6b1..d042dd6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -571,6 +571,14 @@ struct brw_vs_prog_data {
 #define SURF_INDEX_SOL_BINDING(t)((t))
 #define BRW_MAX_GS_SURFACES  
SURF_INDEX_SOL_BINDING(BRW_MAX_SOL_BINDINGS)
 
+/**
+ * Stride in bytes between shader_time entries.
+ *
+ * We separate entries by a cacheline to reduce traffic between EUs writing to
+ * different entries.
+ */
+#define SHADER_TIME_STRIDE 64
+
 enum brw_cache_id {
BRW_BLEND_STATE,
BRW_DEPTH_STENCIL_STATE,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8ce3954..8476bb5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -621,7 +621,7 @@ fs_visitor::emit_shader_time_write(enum 
shader_time_shader_type type,
 
fs_reg offset_mrf = fs_reg(MRF, base_mrf);
offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, fs_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, fs_reg(shader_time_index * SHADER_TIME_STRIDE)));
 
fs_reg time_mrf = fs_reg(MRF, base_mrf + 1);
time_mrf.type = BRW_REGISTER_TYPE_UD;
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 75eb6bc..62954d3 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -228,7 +228,8 @@ brw_init_shader_time(struct brw_context *brw)
 
const int max_entries = 4096;
brw->shader_time.bo = drm_intel_bo_alloc(intel->bufmgr, "shader time",
-max_entries * 4, 4096);
+max_entries * SHADER_TIME_STRIDE,
+4096);
brw->shader_time.programs = rzalloc_array(brw, struct gl_shader_program *,
  max_entries);
brw->shader_time.types = rzalloc_array(brw, enum shader_time_shader_type,
@@ -409,7 +410,7 @@ brw_collect_shader_time(struct brw_context *brw)
uint32_t *times = brw->shader_time.bo->virtual;
 
for (int i = 0; i < brw->shader_time.num_entries; i++) {
-  brw->shader_time.cumulative[i] += times[i];
+  brw->shader_time.cumulative[i] += times[i * SHADER_TIME_STRIDE / 4];
}
 
/* Zero the BO out to clear it out for our next collection.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f319f32..d759710 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1225,7 +1225,7 @@ vec4_visitor::emit_shader_time_write(enum 
shader_time_shader_type type,
 
dst_reg offset_mrf = dst_reg(MRF, base_mrf);
offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, src_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, src_reg(shader_time_index * SHADER_TIME_STRIDE)));
 
dst_reg time_mrf = dst_reg(MRF, base_mrf + 1);
time_mrf.type = BRW_REGISTER_TYPE_UD;
-- 
1.7.10.4
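
For readers following the index arithmetic in the patch above, a small standalone sketch of the three places the stride shows up. SHADER_TIME_STRIDE and the 4096-entry limit are taken from the patch; the helper names and MAX_ENTRIES are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADER_TIME_STRIDE 64   /* bytes between entries, as in the patch */
#define MAX_ENTRIES 4096        /* max_entries in brw_init_shader_time() */

/* Byte offset the shader writes its time to (GPU side). */
uint32_t shader_time_byte_offset(uint32_t index)
{
    assert(index < MAX_ENTRIES);
    return index * SHADER_TIME_STRIDE;
}

/* Index into the mapped buffer when viewed as uint32_t (CPU side):
 * the byte stride is divided by the element size, hence "/ 4". */
size_t shader_time_word_index(size_t index)
{
    assert(index < MAX_ENTRIES);
    return index * SHADER_TIME_STRIDE / sizeof(uint32_t);
}

/* Size of the buffer object: one cacheline per entry instead of 4 bytes. */
size_t shader_time_bo_size(void)
{
    return (size_t)MAX_ENTRIES * SHADER_TIME_STRIDE;
}
```

So entry 1 moves from word 1 to word 16 of the mapping, and the buffer grows from 16 KiB to 256 KiB.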


Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.

2013-03-13 Thread Adrian M Negreanu

On Mar 13, 2013 9:25 AM, Kenneth Graunke kenn...@whitecape.org wrote:

 On 03/13/2013 06:07 AM, Adrian M Negreanu wrote:

 Hi,

 I have tested your changes but it looks like they fail to compile on
Android.


 ==
 Tested the patch(es) on top of the following commits:
 f7ef83c scons: Define PACKAGE_xxx
 6f86b93 docs: rewrite the OSMesa info / instructions
 79eac7d configure: wire-up new OSMesa gallium state tracker and target
 be51f12 target/osmesa: add new Makefile.am
 94263da targets/osmesa: new OSMesa gallium target
 7114b6a st/osmesa: add new Makefile.am
 73436a9 st/osmesa: new OSMesa gallium state tracker

 ===
 Failed to build for android

 f7ef83c scons: Define PACKAGE_xxx
 6f86b93 docs: rewrite the OSMesa info / instructions
 79eac7d configure: wire-up new OSMesa gallium state tracker and target
 be51f12 target/osmesa: add new Makefile.am
 94263da targets/osmesa: new OSMesa gallium target
 7114b6a st/osmesa: add new Makefile.am
 73436a9 st/osmesa: new OSMesa gallium state tracker
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function
 'dump_depth_stencil_state':
 src/mesa/drivers/dri/i965/brw_state_dump.c:373:71: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function
'dump_cc_state_gen6':
 src/mesa/drivers/dri/i965/brw_state_dump.c:407:68: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_scissor':
 src/mesa/drivers/dri/i965/brw_state_dump.c:436:65: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function
'dump_vs_constants':
 src/mesa/drivers/dri/i965/brw_state_dump.c:449:49: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c:450:47: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function
'dump_wm_constants':
 src/mesa/drivers/dri/i965/brw_state_dump.c:466:49: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c:467:47: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function
'dump_binding_table':
 src/mesa/drivers/dri/i965/brw_state_dump.c:483:50: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_state_dump.c: In function
'dump_prog_cache':
 src/mesa/drivers/dri/i965/brw_state_dump.c:511:33: warning: pointer of
 type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function
 'brw_upload_vs_pull_constants':
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c:77:40: warning:
 pointer of type 'void *' used in arithmetic [-Wpointer-arith]
 src/mesa/main/errors.c: In function '_mesa_problem':
 src/mesa/main/errors.c:851:15: error: 'PACKAGE_VERSION' undeclared
 (first use in this function)


 That's definitely Matt's build system changes, not my code.

It was an automated message, but after checking the merged patches I saw
Matt's patch replacing the MESA PACKAGE variable. Apologies for the wrong
report.

Re: [Mesa-dev] OSMesa VTK

2013-03-13 Thread Kevin H. Hobbs

On 03/13/2013 10:40 AM, Brian Paul wrote:
 
 Can you tell me what the parameters are for your OSMesaCreateContext() 
 and OSMesaMakeCurrent() calls (in particular the image format/type)?

I do not know. I'll do some digging into the VTK source and ask on the
VTK list.

 
 You could also try setting GALLIUM_DRIVER=softpipe and see if the 
 softpipe driver works.
 


The test still fails, but it fails differently: instead of a black
image I get a strip of colored snow at the top of the output image...

How about I just show you?

The output (boring black image) of last night's VTK LoadOpenGLExtension
test which used llvmpipe is here :

http://open.cdash.org/testDetails.php?test=180871306&build=2844006

The output of:

$ env GALLIUM_DRIVER=softpipe \
  /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkRenderingOpenGLCxxTests \
  LoadOpenGLExtension -D /home/kevin/kitware/VTKData \
  -T /home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary \
  -V Baseline/Rendering/LoadOpenGLExtension.png

is here :

 http://crab-lab.zool.ohiou.edu/kevin/LoadOpenGLExtension.png




[Mesa-dev] [PATCH 1/4] gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2

2013-03-13 Thread Tom Stellard

From: Tom Stellard thomas.stell...@amd.com

This target string now contains four values instead of three.  The old
processor field (which was really being interpreted as arch) has been split
into two fields: processor and arch.  This allows drivers to pass a
more detailed description of the hardware to compiler frontends.

v2:
  - Adapt to libclc changes
---
 src/gallium/docs/source/screen.rst |8 +-
 src/gallium/drivers/r600/r600_llvm.c   |   63 -
 src/gallium/drivers/r600/r600_llvm.h   |2 -
 src/gallium/drivers/r600/r600_pipe.c   |   74 ++-
 src/gallium/drivers/r600/r600_pipe.h   |2 +
 src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   11 +++
 src/gallium/drivers/radeonsi/radeonsi_pipe.h   |1 +
 src/gallium/drivers/radeonsi/radeonsi_shader.c |4 +-
 .../state_trackers/clover/llvm/invocation.cpp  |   18 --
 9 files changed, 104 insertions(+), 79 deletions(-)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 68d1a35..10836f1 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -222,10 +222,10 @@ PIPE_COMPUTE_CAP_*
 Compute-specific capabilities. They can be queried using
 pipe_screen::get_compute_param.
 
-* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target as a target
-  triple specification of the form ``processor-manufacturer-os`` that will
-  be passed on to the compiler.  This CAP is only relevant for drivers
-  that specify PIPE_SHADER_IR_LLVM for their preferred IR.
+* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target of the form
+  ``processor-arch-manufacturer-os`` that will be passed on to the compiler.
+  This CAP is only relevant for drivers that specify PIPE_SHADER_IR_LLVM for
+  their preferred IR.
   Value type: null-terminated string.
 * ``PIPE_COMPUTE_CAP_GRID_DIMENSION``: Number of supported dimensions
   for grid and block coordinates.  Value type: ``uint64_t``.
diff --git a/src/gallium/drivers/r600/r600_llvm.c 
b/src/gallium/drivers/r600/r600_llvm.c
index 042193c..1552ccb 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -561,69 +561,6 @@ LLVMModuleRef r600_tgsi_llvm(
return ctx-gallivm.module;
 }
 
-const char * r600_llvm_gpu_string(enum radeon_family family)
-{
-   const char * gpu_family;
-
-   switch (family) {
-   case CHIP_R600:
-   case CHIP_RV610:
-   case CHIP_RV630:
-   case CHIP_RV620:
-   case CHIP_RV635:
-   case CHIP_RV670:
-   case CHIP_RS780:
-   case CHIP_RS880:
-   gpu_family = r600;
-   break;
-   case CHIP_RV710:
-   gpu_family = rv710;
-   break;
-   case CHIP_RV730:
-   gpu_family = rv730;
-   break;
-   case CHIP_RV740:
-   case CHIP_RV770:
-   gpu_family = rv770;
-   break;
-   case CHIP_PALM:
-   case CHIP_CEDAR:
-   gpu_family = cedar;
-   break;
-   case CHIP_SUMO:
-   case CHIP_SUMO2:
-   case CHIP_REDWOOD:
-   gpu_family = redwood;
-   break;
-   case CHIP_JUNIPER:
-   gpu_family = juniper;
-   break;
-   case CHIP_HEMLOCK:
-   case CHIP_CYPRESS:
-   gpu_family = cypress;
-   break;
-   case CHIP_BARTS:
-   gpu_family = barts;
-   break;
-   case CHIP_TURKS:
-   gpu_family = turks;
-   break;
-   case CHIP_CAICOS:
-   gpu_family = caicos;
-   break;
-   case CHIP_CAYMAN:
-case CHIP_ARUBA:
-   gpu_family = cayman;
-   break;
-   default:
-   gpu_family = ;
-   fprintf(stderr, Chip not supported by r600 llvm 
-   backend, please file a bug at  PACKAGE_BUGREPORT 
\n);
-   break;
-   }
-   return gpu_family;
-}
-
 unsigned r600_llvm_compile(
LLVMModuleRef mod,
unsigned char ** inst_bytes,
diff --git a/src/gallium/drivers/r600/r600_llvm.h 
b/src/gallium/drivers/r600/r600_llvm.h
index 090d909..b5e2af2 100644
--- a/src/gallium/drivers/r600/r600_llvm.h
+++ b/src/gallium/drivers/r600/r600_llvm.h
@@ -15,8 +15,6 @@ LLVMModuleRef r600_tgsi_llvm(
struct radeon_llvm_context * ctx,
const struct tgsi_token * tokens);
 
-const char * r600_llvm_gpu_string(enum radeon_family family);
-
 unsigned r600_llvm_compile(
LLVMModuleRef mod,
unsigned char ** inst_bytes,
diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 60a0247..66dac62 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -760,18 +760,84 @@ static int r600_get_video_param(struct pipe_screen 
*screen,
}
 }
 
+const char *
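
As an illustration of the four-field format this patch documents, a sketch of how a consumer might split the string. This is not the clover code; the struct, the function name, and the sample string "cayman-r600--" are assumptions made up for the example. Empty fields are allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 32   /* arbitrary limit for this sketch */

struct ir_target {
    char processor[FIELD_MAX];
    char arch[FIELD_MAX];
    char manufacturer[FIELD_MAX];
    char os[FIELD_MAX];
};

/* Split "processor-arch-manufacturer-os" into its four fields.
 * Returns 0 on success, -1 if the string does not have exactly four
 * '-'-separated fields or a field does not fit. */
int parse_ir_target(const char *s, struct ir_target *out)
{
    char *fields[4];

    assert(s != NULL && out != NULL);
    fields[0] = out->processor;
    fields[1] = out->arch;
    fields[2] = out->manufacturer;
    fields[3] = out->os;

    for (int i = 0; i < 4; i++) {
        const char *dash = strchr(s, '-');
        size_t len = dash ? (size_t)(dash - s) : strlen(s);

        /* The first three fields must end in '-', the last must not. */
        if ((i < 3) != (dash != NULL) || len >= FIELD_MAX)
            return -1;

        memcpy(fields[i], s, len);
        fields[i][len] = '\0';
        s += len + (dash ? 1 : 0);
    }
    return 0;
}
```

An old three-field string such as "r600--" is rejected by this sketch, which is the kind of mismatch a frontend would hit if it were not updated along with the driver.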

[Mesa-dev] [PATCH 2/4] radeonsi: Remove si_pm4_inval_vertex_cache()

2013-03-13 Thread Tom Stellard

From: Tom Stellard thomas.stell...@amd.com

This function is a holdover from r600g and is identical to
si_pm4_inval_texture_cache(), so it is not needed.
---
 src/gallium/drivers/radeonsi/radeonsi_pm4.c  |6 --
 src/gallium/drivers/radeonsi/radeonsi_pm4.h  |1 -
 src/gallium/drivers/radeonsi/si_state_draw.c |2 +-
 3 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pm4.c 
b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
index 79a2521..9a884f7 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pm4.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
@@ -139,12 +139,6 @@ void si_pm4_inval_texture_cache(struct si_pm4_state *state)
state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
 }
 
-void si_pm4_inval_vertex_cache(struct si_pm4_state *state)
-{
-/* Some GPUs don't have the vertex cache and must use the texture cache instead. */
-   state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
-}
-
 void si_pm4_inval_fb_cache(struct si_pm4_state *state, unsigned nr_cbufs)
 {
state->cp_coher_cntl |= S_0085F0_CB_ACTION_ENA(1);
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pm4.h 
b/src/gallium/drivers/radeonsi/radeonsi_pm4.h
index 2ad62d6..bdeb930 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pm4.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pm4.h
@@ -75,7 +75,6 @@ void si_pm4_sh_data_end(struct si_pm4_state *state, unsigned 
base, unsigned idx)
 
 void si_pm4_inval_shader_cache(struct si_pm4_state *state);
 void si_pm4_inval_texture_cache(struct si_pm4_state *state);
-void si_pm4_inval_vertex_cache(struct si_pm4_state *state);
 void si_pm4_inval_fb_cache(struct si_pm4_state *state, unsigned nr_cbufs);
 void si_pm4_inval_zsbuf_cache(struct si_pm4_state *state);
 
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 1049d2b..b78f20a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -416,7 +416,7 @@ static void si_vertex_buffer_update(struct r600_context 
*rctx)
unsigned i, count;
uint64_t va;
 
-   si_pm4_inval_vertex_cache(pm4);
+   si_pm4_inval_texture_cache(pm4);
 
/* bind vertex buffer once */
count = rctx->vertex_elements->count;
-- 
1.7.3.4


[Mesa-dev] [PATCH 3/4] radeonsi: Use TCL1_ACTION_ENA when invalidating the texture cache

2013-03-13 Thread Tom Stellard

From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_pm4.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pm4.c 
b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
index 9a884f7..4ea30f6 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pm4.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pm4.c
@@ -137,6 +137,7 @@ void si_pm4_inval_shader_cache(struct si_pm4_state *state)
 void si_pm4_inval_texture_cache(struct si_pm4_state *state)
 {
state->cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
+   state->cp_coher_cntl |= S_0085F0_TCL1_ACTION_ENA(1);
 }
 
 void si_pm4_inval_fb_cache(struct si_pm4_state *state, unsigned nr_cbufs)
-- 
1.7.3.4


[Mesa-dev] [PATCH 4/4] radeonsi: Add compute support v3

2013-03-13 Thread Tom Stellard

From: Tom Stellard thomas.stell...@amd.com

v2:
  - Only dump shaders when env variable is set.

v3:
  - Don't emit VGT registers
---
 src/gallium/drivers/radeon/radeon_llvm_util.c   |2 +-
 src/gallium/drivers/radeon/radeon_llvm_util.h   |2 +
 src/gallium/drivers/radeonsi/Makefile.sources   |1 +
 src/gallium/drivers/radeonsi/radeonsi_compute.c |  234 +++
 src/gallium/drivers/radeonsi/radeonsi_pipe.c|   61 ++-
 src/gallium/drivers/radeonsi/radeonsi_pipe.h|   10 +
 src/gallium/drivers/radeonsi/radeonsi_pm4.c |5 +-
 src/gallium/drivers/radeonsi/radeonsi_pm4.h |2 +
 src/gallium/drivers/radeonsi/radeonsi_shader.c  |   96 ++
 src/gallium/drivers/radeonsi/radeonsi_shader.h  |4 +
 src/gallium/drivers/radeonsi/sid.h  |6 +
 11 files changed, 378 insertions(+), 45 deletions(-)
 create mode 100644 src/gallium/drivers/radeonsi/radeonsi_compute.c

diff --git a/src/gallium/drivers/radeon/radeon_llvm_util.c 
b/src/gallium/drivers/radeon/radeon_llvm_util.c
index b2ecb1a..2582d9c 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_util.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_util.c
@@ -30,7 +30,7 @@
 #include <llvm-c/BitReader.h>
 #include <llvm-c/Core.h>
 
-static LLVMModuleRef radeon_llvm_parse_bitcode(const unsigned char * bitcode,
+LLVMModuleRef radeon_llvm_parse_bitcode(const unsigned char * bitcode,
unsigned bitcode_len)
 {
LLVMMemoryBufferRef buf;
diff --git a/src/gallium/drivers/radeon/radeon_llvm_util.h 
b/src/gallium/drivers/radeon/radeon_llvm_util.h
index 7db25bb..b851648 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_util.h
+++ b/src/gallium/drivers/radeon/radeon_llvm_util.h
@@ -29,6 +29,8 @@
 
 #include <llvm-c/Core.h>
 
+LLVMModuleRef radeon_llvm_parse_bitcode(const unsigned char * bitcode,
+   unsigned bitcode_len);
 unsigned radeon_llvm_get_num_kernels(const unsigned char *bitcode, unsigned 
bitcode_len);
 LLVMModuleRef radeon_llvm_get_kernel_module(unsigned index,
const unsigned char *bitcode, unsigned bitcode_len);
diff --git a/src/gallium/drivers/radeonsi/Makefile.sources 
b/src/gallium/drivers/radeonsi/Makefile.sources
index 65da1ac..5e1cc4f 100644
--- a/src/gallium/drivers/radeonsi/Makefile.sources
+++ b/src/gallium/drivers/radeonsi/Makefile.sources
@@ -9,6 +9,7 @@ C_SOURCES := \
r600_texture.c \
r600_translate.c \
radeonsi_pm4.c \
+   radeonsi_compute.c \
si_state.c \
si_state_streamout.c \
si_state_draw.c \
diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c 
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
new file mode 100644
index 000..1e8978c
--- /dev/null
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -0,0 +1,234 @@
+#include "util/u_memory.h"
+
+#include "radeonsi_pipe.h"
+#include "radeonsi_shader.h"
+
+#include "radeon_llvm_util.h"
+
+struct si_pipe_compute {
+   struct r600_context *ctx;
+
+   unsigned local_size;
+   unsigned private_size;
+   unsigned input_size;
+   struct si_pipe_shader shader;
+   unsigned num_user_sgprs;
+
+struct si_pm4_state *pm4_buffers;
+
+};
+
+static void *radeonsi_create_compute_state(
+   struct pipe_context *ctx,
+   const struct pipe_compute_state *cso)
+{
+   struct r600_context *rctx = (struct r600_context *)ctx;
+   struct si_pipe_compute *program =
+   CALLOC_STRUCT(si_pipe_compute);
+   const struct pipe_llvm_program_header *header;
+   const unsigned char *code;
+   LLVMModuleRef mod;
+
+   header = cso->prog;
+   code = cso->prog + sizeof(struct pipe_llvm_program_header);
+
+   program->ctx = rctx;
+   program->local_size = cso->req_local_mem;
+   program->private_size = cso->req_private_mem;
+   program->input_size = cso->req_input_mem;
+
+   mod = radeon_llvm_parse_bitcode(code, header->num_bytes);
+   si_compile_llvm(rctx, &program->shader, mod);
+
+   return program;
+}
+
+static void radeonsi_bind_compute_state(struct pipe_context *ctx, void *state)
+{
+   struct r600_context *rctx = (struct r600_context*)ctx;
+   rctx->cs_shader_state.program = (struct si_pipe_compute*)state;
+}
+
+static void radeonsi_set_global_binding(
+   struct pipe_context *ctx, unsigned first, unsigned n,
+   struct pipe_resource **resources,
+   uint32_t **handles)
+{
+   unsigned i;
+   struct r600_context *rctx = (struct r600_context*)ctx;
+   struct si_pipe_compute *program = rctx->cs_shader_state.program;
+   struct si_pm4_state *pm4;
+
+   if (!program->pm4_buffers) {
+   program->pm4_buffers = CALLOC_STRUCT(si_pm4_state);
+   }
+   pm4 = program->pm4_buffers;
+   pm4->compute_pkt = true;
+
+   if (!resources) {
+   return;
+   }
+
+   for (i = first; i < first + n; i++) {

Re: [Mesa-dev] [PATCH 1/4] gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2

2013-03-13 Thread Alex Deucher

On Wed, Mar 13, 2013 at 2:11 PM, Tom Stellard t...@stellard.net wrote:
 From: Tom Stellard thomas.stell...@amd.com

 This target string now contains four values instead of three.  The old
 processor field (which was really being interpreted as arch) has been split
 into two fields: processor and arch.  This allows drivers to pass a
 more detailed description of the hardware to compiler frontends.

 v2:
   - Adapt to libclc changes

for the series:

Reviewed-by: Alex Deucher alexander.deuc...@amd.com


 ---
  src/gallium/docs/source/screen.rst |8 +-
  src/gallium/drivers/r600/r600_llvm.c   |   63 -
  src/gallium/drivers/r600/r600_llvm.h   |2 -
  src/gallium/drivers/r600/r600_pipe.c   |   74 ++-
  src/gallium/drivers/r600/r600_pipe.h   |2 +
  src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   11 +++
  src/gallium/drivers/radeonsi/radeonsi_pipe.h   |1 +
  src/gallium/drivers/radeonsi/radeonsi_shader.c |4 +-
  .../state_trackers/clover/llvm/invocation.cpp  |   18 --
  9 files changed, 104 insertions(+), 79 deletions(-)

 diff --git a/src/gallium/docs/source/screen.rst 
 b/src/gallium/docs/source/screen.rst
 index 68d1a35..10836f1 100644
 --- a/src/gallium/docs/source/screen.rst
 +++ b/src/gallium/docs/source/screen.rst
 @@ -222,10 +222,10 @@ PIPE_COMPUTE_CAP_*
  Compute-specific capabilities. They can be queried using
  pipe_screen::get_compute_param.

 -* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target as a target
 -  triple specification of the form ``processor-manufacturer-os`` that will
 -  be passed on to the compiler.  This CAP is only relevant for drivers
 -  that specify PIPE_SHADER_IR_LLVM for their preferred IR.
 +* ``PIPE_COMPUTE_CAP_IR_TARGET``: A description of the target of the form
 +  ``processor-arch-manufacturer-os`` that will be passed on to the compiler.
 +  This CAP is only relevant for drivers that specify PIPE_SHADER_IR_LLVM for
 +  their preferred IR.
Value type: null-terminated string.
  * ``PIPE_COMPUTE_CAP_GRID_DIMENSION``: Number of supported dimensions
for grid and block coordinates.  Value type: ``uint64_t``.
 diff --git a/src/gallium/drivers/r600/r600_llvm.c 
 b/src/gallium/drivers/r600/r600_llvm.c
 index 042193c..1552ccb 100644
 --- a/src/gallium/drivers/r600/r600_llvm.c
 +++ b/src/gallium/drivers/r600/r600_llvm.c
 @@ -561,69 +561,6 @@ LLVMModuleRef r600_tgsi_llvm(
 return ctx-gallivm.module;
  }

 -const char * r600_llvm_gpu_string(enum radeon_family family)
 -{
 -   const char * gpu_family;
 -
 -   switch (family) {
 -   case CHIP_R600:
 -   case CHIP_RV610:
 -   case CHIP_RV630:
 -   case CHIP_RV620:
 -   case CHIP_RV635:
 -   case CHIP_RV670:
 -   case CHIP_RS780:
 -   case CHIP_RS880:
 -   gpu_family = r600;
 -   break;
 -   case CHIP_RV710:
 -   gpu_family = rv710;
 -   break;
 -   case CHIP_RV730:
 -   gpu_family = rv730;
 -   break;
 -   case CHIP_RV740:
 -   case CHIP_RV770:
 -   gpu_family = rv770;
 -   break;
 -   case CHIP_PALM:
 -   case CHIP_CEDAR:
 -   gpu_family = cedar;
 -   break;
 -   case CHIP_SUMO:
 -   case CHIP_SUMO2:
 -   case CHIP_REDWOOD:
 -   gpu_family = redwood;
 -   break;
 -   case CHIP_JUNIPER:
 -   gpu_family = juniper;
 -   break;
 -   case CHIP_HEMLOCK:
 -   case CHIP_CYPRESS:
 -   gpu_family = cypress;
 -   break;
 -   case CHIP_BARTS:
 -   gpu_family = barts;
 -   break;
 -   case CHIP_TURKS:
 -   gpu_family = turks;
 -   break;
 -   case CHIP_CAICOS:
 -   gpu_family = caicos;
 -   break;
 -   case CHIP_CAYMAN:
 -case CHIP_ARUBA:
 -   gpu_family = cayman;
 -   break;
 -   default:
 -   gpu_family = ;
 -   fprintf(stderr, Chip not supported by r600 llvm 
 -   backend, please file a bug at  PACKAGE_BUGREPORT 
 \n);
 -   break;
 -   }
 -   return gpu_family;
 -}
 -
  unsigned r600_llvm_compile(
 LLVMModuleRef mod,
 unsigned char ** inst_bytes,
 diff --git a/src/gallium/drivers/r600/r600_llvm.h 
 b/src/gallium/drivers/r600/r600_llvm.h
 index 090d909..b5e2af2 100644
 --- a/src/gallium/drivers/r600/r600_llvm.h
 +++ b/src/gallium/drivers/r600/r600_llvm.h
 @@ -15,8 +15,6 @@ LLVMModuleRef r600_tgsi_llvm(
 struct radeon_llvm_context * ctx,
 const struct tgsi_token * tokens);

 -const char * r600_llvm_gpu_string(enum radeon_family family);
 -
  unsigned r600_llvm_compile(
 LLVMModuleRef mod,
 unsigned char ** inst_bytes,
 diff --git a/src/gallium/drivers/r600/r600_pipe.c

Re: [Mesa-dev] [PATCH 2/2] i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.

2013-03-13 Thread Paul Berry

On 12 March 2013 16:33, Eric Anholt e...@anholt.net wrote:

 Paul Berry stereotype...@gmail.com writes:
   void
  -brw_workaround_depthstencil_alignment(struct brw_context *brw)
  +brw_workaround_depthstencil_alignment(struct brw_context *brw,
  +  GLbitfield clear_mask)
   {
  struct intel_context *intel = &brw->intel;
  struct gl_context *ctx = &intel->ctx;
  @@ -341,10 +343,24 @@ brw_workaround_depthstencil_alignment(struct
 brw_context *brw)
  struct intel_mipmap_tree *stencil_mt =
 get_stencil_miptree(stencil_irb);
  uint32_t tile_x = 0, tile_y = 0, stencil_tile_x = 0, stencil_tile_y
 = 0;
  uint32_t stencil_draw_x = 0, stencil_draw_y = 0;
  +   bool invalidate_depth = clear_mask & GL_DEPTH_BUFFER_BIT;
  +   bool invalidate_stencil = clear_mask & GL_STENCIL_BUFFER_BIT;
 
  if (depth_irb)
 depth_mt = depth_irb->mt;
 
  +   if (depth_irb && invalidate_depth &&
  +_mesa_is_depthstencil_format(
  +  _mesa_get_format_base_format(depth_mt->format)) &&
  +!depth_mt->stencil_mt) {

 The only _mesa_is_depthstencil_format() returned by
 _mesa_get_format_base_format() is GL_DEPTH_STENCIL, so calling that
 seems kinda overkill.


Good point.  I'll fix that before pushing.  I'll also make a follow-up
patch to fix the function I borrowed this test from
(intel_miptree_create_layout).



 If depth_mt->stencil_mt, then depth_mt->format's base format will not be
 GL_DEPTH_STENCIL.  I'm concerned that you're going to lose the
 depth_mt->stencil_mt contents of a gl-level packed depth/stencil texture
 that's backed by separate stencil.


I think you inverted your logic there (if depth_mt->stencil_mt, then
depth_mt->format's base format *will* be GL_DEPTH_STENCIL).  Am I correct
in inferring that the cases you're worried about are cases like:

- Client creates a GL_DEPTH_STENCIL texture on a platform such as Gen7 that
uses separate stencil
- Client executes a glClear(GL_DEPTH_BIT) on a miplevel that needs the
workaround, expecting stencil data to be preserved

I think you may be right to worry about this--previously I had assumed that
calling intel_renderbuffer_move_to_temp(depth_mt) on a depth/stencil
texture backed by separate stencil would only relocate the depth miptree
layer, and leave the stencil miptree layer alone.  But rereading the code
again, I'm less certain of that.

I'll write a piglit test to exercise exactly this situation, and post a v2
of the patch if necessary.

[Mesa-dev] [Bug 61416] Clover doesn't work on a PRIME system when run under X

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61416

--- Comment #5 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 76494
  --> https://bugs.freedesktop.org/attachment.cgi?id=76494&action=edit
Xorg log old


[Mesa-dev] [Bug 61416] Clover doesn't work on a PRIME system when run under X

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61416

Mike Lothian m...@fireburn.co.uk changed:

   What|Removed |Added

  Attachment #75466|0   |1
is obsolete||

--- Comment #6 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 76495
  --> https://bugs.freedesktop.org/attachment.cgi?id=76495&action=edit
Xorg log


[Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD

2013-03-13 Thread Christoph Bumiller

Second attempt, 2 years ago no one replied or cared ...

We really need to know about these on nvc0 because there are only 8
fixed hardware locations that can be overwritten by sprite coordinates,
and one location that represents gl_PointCoord and unconditionally
returns sprite coordinates.

So far this was solved via a hack, which works since the locations the
state tracker picks aren't dynamic (and likely will never be, to facilitate
ARB_separate_shader_objects), but it still isn't nice to do it this way.

It looks like nv30 was using a hack, too, since it had a check for
Semantic.Index == 9, which is what mesa uses for PointCoord.

Implementing a safe, non-mesa-dependent way without these SEMANTICs would
be jumping through hoops and doing expensive shader recompilations just
because we like to destroy information at the gallium threshold, and that's
unacceptable.

I started to (try to) fix up the other drivers, but maybe we just want a CAP
for this instead, since the default solution - if this is TEXCOORD then
treat it as GENERIC with semantic index += MAX_TEXCOORDS - doesn't really
look much nicer either.
E.g. if PIPE_CAP_RESTRICTED_SPRITE_COORDS is advertised, the state tracker
should use the TEXCOORD and PCOORD semantics, otherwise it should just use
GENERICs as before.
---
 src/gallium/auxiliary/draw/draw_pipe_wide_point.c  |   39 
 src/gallium/auxiliary/tgsi/tgsi_dump.c |1 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |2 +
 src/gallium/docs/source/cso/rasterizer.rst |2 +-
 src/gallium/docs/source/tgsi.rst   |   23 +-
 src/gallium/drivers/freedreno/freedreno_compiler.c |2 +
 src/gallium/drivers/i915/i915_fpc_translate.c  |2 +
 src/gallium/drivers/i915/i915_state_derived.c  |4 ++
 src/gallium/drivers/llvmpipe/lp_setup_point.c  |   29 ++--
 src/gallium/drivers/nv30/nvfx_fragprog.c   |   39 
 src/gallium/drivers/nv50/nv50_shader_state.c   |8 +--
 src/gallium/drivers/nv50/nv50_surface.c|5 +-
 src/gallium/drivers/nvc0/nvc0_program.c|   37 +--
 src/gallium/drivers/r300/r300_fs.c |2 +
 src/gallium/drivers/r300/r300_shader_semantics.h   |3 +-
 src/gallium/drivers/r300/r300_vs.c |2 +
 src/gallium/drivers/r600/evergreen_state.c |7 ++-
 src/gallium/drivers/r600/r600_shader.c |3 +-
 src/gallium/drivers/r600/r600_state.c  |7 ++-
 src/gallium/drivers/radeonsi/radeonsi_shader.c |1 +
 src/gallium/drivers/radeonsi/si_state.c|2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c   |5 +-
 src/gallium/include/pipe/p_shader_tokens.h |   36 +--
 src/gallium/include/pipe/p_state.h |2 +-
 src/mesa/state_tracker/st_atom_rasterizer.c|6 +--
 src/mesa/state_tracker/st_program.c|   48 +--
 26 files changed, 162 insertions(+), 155 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c 
b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
index 8e0a117..d4ed0f7 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
@@ -233,28 +233,29 @@ widepoint_first_point(struct draw_stage *stage,
 
    wide->num_texcoord_gen = 0;
 
-   /* Loop over fragment shader inputs looking for generic inputs
-    * for which bit 'k' in sprite_coord_enable is set.
+   /* Loop over fragment shader inputs looking for the PCOORD input or
+    * TEXCOORD inputs for which bit 'k' in sprite_coord_enable is set.
     */
    for (i = 0; i < fs->info.num_inputs; i++) {
-      if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_GENERIC) {
-         const int generic_index = fs->info.input_semantic_index[i];
-         /* Note that sprite_coord enable is a bitfield of
-          * PIPE_MAX_SHADER_OUTPUTS bits.
-          */
-         if (generic_index < PIPE_MAX_SHADER_OUTPUTS &&
-             (rast->sprite_coord_enable & (1 << generic_index))) {
-            /* OK, this generic attribute needs to be replaced with a
-             * texcoord (see above).
-             */
-            int slot = draw_alloc_extra_vertex_attrib(draw,
-                                                      TGSI_SEMANTIC_GENERIC,
-                                                      generic_index);
-
-            /* add this slot to the texcoord-gen list */
-            wide->texcoord_gen_slot[wide->num_texcoord_gen++] = slot;
-         }
+      int slot;
+      const unsigned sn = fs->info.input_semantic_name[i];
+      const unsigned si = fs->info.input_semantic_index[i];
+
+      if (sn == TGSI_SEMANTIC_TEXCOORD) {
+         /* Note that sprite_coord enable is a bitfield of 8 bits. */
+         if (si >= 8 || !(rast->sprite_coord_enable & (1 << si)))
+
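
The check in the truncated hunk above can be sketched in plain C. This is only
an illustration, assuming that sprite_coord_enable is an 8-bit mask in which
bit k selects TEXCOORD[k], and that a PCOORD input always receives sprite
coordinates, as the patch description suggests. The enum and function names
below are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TGSI semantic names discussed in the patch. */
enum sketch_semantic {
   SKETCH_SEMANTIC_GENERIC,
   SKETCH_SEMANTIC_TEXCOORD,
   SKETCH_SEMANTIC_PCOORD
};

/* Returns true when a fragment shader input should be fed sprite coordinates.
 * Assumption: only TEXCOORD[0..7] can be enabled through the 8-bit mask,
 * PCOORD is always replaced, and GENERIC inputs are never touched.
 */
static bool
input_gets_sprite_coord(enum sketch_semantic name, unsigned index,
                        uint8_t sprite_coord_enable)
{
   switch (name) {
   case SKETCH_SEMANTIC_PCOORD:
      return true;
   case SKETCH_SEMANTIC_TEXCOORD:
      if (index >= 8)
         return false; /* outside the 8-bit mask */
      return (sprite_coord_enable & (1u << index)) != 0;
   default:
      return false;
   }
}
```

With a mask of 0x05, TEXCOORD[0] and TEXCOORD[2] would be replaced and
TEXCOORD[1] would not; the range test on the index comes first so the shift
never exceeds the width of the mask.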

[Mesa-dev] [Bug 61416] Clover doesn't work on a PRIME system when run under X

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61416

--- Comment #7 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 76498
  --> https://bugs.freedesktop.org/attachment.cgi?id=76498&action=edit
Glibc error

This shows the glibc linked lists issue


[Mesa-dev] [Bug 61416] Clover doesn't authenticate when not run as a privileged user

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61416

Mike Lothian m...@fireburn.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Clover doesn't work on a    |Clover doesn't authenticate
                   |PRIME system when run under |when not run as a
                   |X                           |privileged user


Re: [Mesa-dev] [PATCH 10/12] Get rid of _mesa_vert_result_to_frag_attrib().

2013-03-13 Thread Paul Berry

On 12 March 2013 19:29, Eric Anholt e...@anholt.net wrote:

> Paul Berry stereotype...@gmail.com writes:
>
> > Now that there is no difference between the enums that represent
> > vertex outputs and fragment inputs, there's no need for a conversion
> > function.  But we still need to be able to detect when a given vertex
> > output has no corresponding fragment input.  So it is replaced by a
> > new function, _mesa_varying_slot_in_fs(), which tells whether the
> > given varying slot exists as an FS input or not.
  ---
   src/mesa/drivers/dri/i965/brw_fs.cpp| 12 -
   src/mesa/drivers/dri/i965/brw_vs_constval.c | 13 --
   src/mesa/main/mtypes.h  | 38
 +
   3 files changed, 27 insertions(+), 36 deletions(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
  index 86f8cbb..ea4a56c 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
  @@ -1265,7 +1265,7 @@ fs_visitor::calculate_urb_setup()
   continue;
 
 if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {
  - int fp_index =
 _mesa_vert_result_to_frag_attrib((gl_varying_slot) i);
  +bool exists_in_fs =
 _mesa_varying_slot_in_fs((gl_varying_slot) i);

> I'd rather see this call moved into the single usage in the if statement
> below, like has been done elsewhere (now that the function name
> explicitly talks about what's being tested in the if anyway)


Fair enough--I'll fix that.



/* The back color slot is skipped when the front color is
 * also written to.  In addition, some slots can be
  @@ -1273,8 +1273,8 @@ fs_visitor::calculate_urb_setup()
 * fragment shader.  So the register number must always be
 * incremented, mapped or not.
 */
  - if (fp_index >= 0)
  -urb_setup[fp_index] = urb_next;
  + if (exists_in_fs)
  +urb_setup[i] = urb_next;
   urb_next++;


   /**
  - * Convert from a gl_varying_slot value for a vertex output to the
  - * corresponding gl_frag_attrib.
  - *
  - * Varying output values which have no corresponding gl_frag_attrib
  - * (VARYING_SLOT_PSIZ, VARYING_SLOT_BFC0, VARYING_SLOT_BFC1, and
  - * VARYING_SLOT_EDGE) are converted to a value of -1.
  + * Determine if the given gl_varying_slot appears in the fragment
 shader.
*/
  -static inline int
  -_mesa_vert_result_to_frag_attrib(gl_varying_slot vert_result)
  +static inline GLboolean
  +_mesa_varying_slot_in_fs(gl_varying_slot slot)
   {
  -   if (vert_result <= VARYING_SLOT_TEX7)
  -      return vert_result;
  -   else if (vert_result < VARYING_SLOT_CLIP_DIST0)
  -      return -1;
  -   else if (vert_result <= VARYING_SLOT_CLIP_DIST1)
  -      return vert_result;
  -   else if (vert_result < VARYING_SLOT_VAR0)
  -  return -1;
  -   else
  -  return vert_result;
  +   switch (slot) {
  +   case VARYING_SLOT_PSIZ:
  +   case VARYING_SLOT_BFC0:
  +   case VARYING_SLOT_BFC1:
  +   case VARYING_SLOT_EDGE:
  +   case VARYING_SLOT_CLIP_VERTEX:
  +   case VARYING_SLOT_LAYER:
  +  return GL_FALSE;
  +   default:
  +  return GL_TRUE;
  +   }
   }

> I bet the compiler does a big switch statement instead of doing what we
> could do better with bitfields.  Not a blocker, just a potential
> improvement.


Hmm, now I'm curious.

Amazingly enough, gcc with -O2 is actually smart enough to use a bitfield:

_Z24_mesa_varying_slot_in_fs15gl_varying_slot:
.LFB0:
        .cfi_startproc
        cmpl    $20, %edi
        ja      .L4
        movl    $1, %edx
        movl    %edi, %ecx
        xorl    %eax, %eax
        salq    %cl, %rdx
        testl   $1175552, %edx
        je      .L4
        rep
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        movl    $1, %eax
        ret
        .cfi_endproc

I'm impressed.
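
For readers who want to see what that listing computes: the constant 1175552
is 0x11F000, i.e. bits 12 through 16 plus bit 20, which is six bits for the
six cases in the switch. Below is a minimal C sketch of the same test. Which
slot name maps to which bit is an assumption here (inferred from the mask and
the order of the case labels, not checked against mtypes.h), and all
identifiers are hypothetical.

```c
#include <stdbool.h>

/* Assumed slot numbers: only the mask value itself comes from the listing. */
enum sketch_slot {
   SKETCH_SLOT_PSIZ = 12,
   SKETCH_SLOT_BFC0 = 13,
   SKETCH_SLOT_BFC1 = 14,
   SKETCH_SLOT_EDGE = 15,
   SKETCH_SLOT_CLIP_VERTEX = 16,
   SKETCH_SLOT_LAYER = 20
};

/* One bit per slot that has no fragment shader input; equals 1175552. */
#define SKETCH_NOT_IN_FS_MASK                                      \
   ((1u << SKETCH_SLOT_PSIZ) | (1u << SKETCH_SLOT_BFC0) |          \
    (1u << SKETCH_SLOT_BFC1) | (1u << SKETCH_SLOT_EDGE) |          \
    (1u << SKETCH_SLOT_CLIP_VERTEX) | (1u << SKETCH_SLOT_LAYER))

/* The switch form, as written in the patch. */
static bool
slot_in_fs_switch(unsigned slot)
{
   switch (slot) {
   case SKETCH_SLOT_PSIZ:
   case SKETCH_SLOT_BFC0:
   case SKETCH_SLOT_BFC1:
   case SKETCH_SLOT_EDGE:
   case SKETCH_SLOT_CLIP_VERTEX:
   case SKETCH_SLOT_LAYER:
      return false;
   default:
      return true;
   }
}

/* The form the compiler arrived at: one range check, one shift, one test. */
static bool
slot_in_fs_mask(unsigned slot)
{
   if (slot > 20)
      return true; /* above the highest excluded slot */
   return (SKETCH_NOT_IN_FS_MASK & (1u << slot)) == 0;
}
```

Both functions should agree for every slot value; the range check mirrors the
cmpl/ja pair at the top of the listing.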

[Mesa-dev] [PATCH] R600: Factorize code handling Const Read Port limitation

2013-03-13 Thread Vincent Lejeune

---
 lib/Target/R600/AMDILISelDAGToDAG.cpp| 34 ++
 lib/Target/R600/R600InstrInfo.cpp| 54 ++
 lib/Target/R600/R600InstrInfo.h  |  3 ++
 lib/Target/R600/R600MachineScheduler.cpp | 77 
 lib/Target/R600/R600MachineScheduler.h   |  3 +-
 test/CodeGen/R600/kcache-fold-2.ll   | 52 +
 6 files changed, 144 insertions(+), 79 deletions(-)
 create mode 100644 test/CodeGen/R600/kcache-fold-2.ll

diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp 
b/lib/Target/R600/AMDILISelDAGToDAG.cpp
index 0c7880d..05a1ea7 100644
--- a/lib/Target/R600/AMDILISelDAGToDAG.cpp
+++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp
@@ -336,6 +336,7 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
   return Result;
 }
 
+
 bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
     const R600InstrInfo *TII, std::vector<SDValue> &Ops) {
   int OperandIdx[] = {
@@ -365,17 +366,34 @@ bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
     SDValue Operand = Ops[OperandIdx[i] - 1];
     switch (Operand.getOpcode()) {
     case AMDGPUISD::CONST_ADDRESS: {
-      if (i == 2)
-        break;
       SDValue CstOffset;
-      if (!Operand.getValueType().isVector() &&
-          SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset)) {
-        Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
-        Ops[SelIdx[i] - 1] = CstOffset;
-        return true;
+      if (Operand.getValueType().isVector() ||
+          !SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset))
+        break;
+
+      // Gather others constants values
+      std::vector<unsigned> Consts;
+      for (unsigned j = 0; j < 3; j++) {
+        int SrcIdx = OperandIdx[j];
+        if (SrcIdx < 0)
+          break;
+        if (RegisterSDNode *Reg = dyn_cast<RegisterSDNode>(Ops[SrcIdx - 1])) {
+          if (Reg->getReg() == AMDGPU::ALU_CONST) {
+            ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(Ops[SelIdx[j] - 1]);
+            Consts.push_back(Cst->getZExtValue());
+          }
+        }
       }
+
+      ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(CstOffset);
+      Consts.push_back(Cst->getZExtValue());
+      if (!TII->fitsConstReadLimitations(Consts))
+        break;
+
+      Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
+      Ops[SelIdx[i] - 1] = CstOffset;
+      return true;
     }
-      break;
     case ISD::FNEG:
       if (NegIdx[i] < 0)
         break;
diff --git a/lib/Target/R600/R600InstrInfo.cpp 
b/lib/Target/R600/R600InstrInfo.cpp
index be3318a..0865098 100644
--- a/lib/Target/R600/R600InstrInfo.cpp
+++ b/lib/Target/R600/R600InstrInfo.cpp
@@ -139,6 +139,60 @@ bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
           (TargetFlags & R600_InstFlag::OP3));
 }
 
+bool
+R600InstrInfo::fitsConstReadLimitations(const std::vector<unsigned> &Consts)
+    const {
+  assert (Consts.size() <= 12 && "Too many operands in instructions group");
+  unsigned Pair1 = 0, Pair2 = 0;
+  for (unsigned i = 0, n = Consts.size(); i < n; ++i) {
+    unsigned ReadConstHalf = Consts[i] & 2;
+    unsigned ReadConstIndex = Consts[i] & (~3);
+    unsigned ReadHalfConst = ReadConstIndex | ReadConstHalf;
+    if (!Pair1) {
+      Pair1 = ReadHalfConst;
+      continue;
+    }
+    if (Pair1 == ReadHalfConst)
+      continue;
+    if (!Pair2) {
+      Pair2 = ReadHalfConst;
+      continue;
+    }
+    if (Pair2 != ReadHalfConst)
+      return false;
+  }
+  return true;
+}
+
+bool
+R600InstrInfo::canBundle(const std::vector<MachineInstr *> &MIs) const {
+  std::vector<unsigned> Consts;
+  for (unsigned i = 0, n = MIs.size(); i < n; i++) {
+    const MachineInstr *MI = MIs[i];
+
+    const R600Operands::Ops OpTable[3][2] = {
+      {R600Operands::SRC0, R600Operands::SRC0_SEL},
+      {R600Operands::SRC1, R600Operands::SRC1_SEL},
+      {R600Operands::SRC2, R600Operands::SRC2_SEL},
+    };
+
+    if (!isALUInstr(MI->getOpcode()))
+      continue;
+
+    for (unsigned j = 0; j < 3; j++) {
+      int SrcIdx = getOperandIdx(MI->getOpcode(), OpTable[j][0]);
+      if (SrcIdx < 0)
+        break;
+      if (MI->getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST) {
+        unsigned Const = MI->getOperand(
+            getOperandIdx(MI->getOpcode(), OpTable[j][1])).getImm();
+        Consts.push_back(Const);
+      }
+    }
+  }
+  return fitsConstReadLimitations(Consts);
+}
+
 DFAPacketizer *R600InstrInfo::CreateTargetScheduleState(const TargetMachine *TM,
     const ScheduleDAG *DAG) const {
   const InstrItineraryData *II = TM->getInstrItineraryData();
diff --git a/lib/Target/R600/R600InstrInfo.h b/lib/Target/R600/R600InstrInfo.h
index efe721c..bf9569e 100644
--- a/lib/Target/R600/R600InstrInfo.h
+++ b/lib/Target/R600/R600InstrInfo.h
@@ -53,6 +53,9 @@ namespace llvm {
   /// \returns true if this \p Opcode represents an ALU instruction.
   bool isALUInstr(unsigned Opcode) const;
 
+  bool fitsConstReadLimitations(const std::vector<unsigned>&) const;
+
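
The limitation check added in R600InstrInfo.cpp can be restated as a small
standalone C function. This is a sketch of how I read the patch, not the LLVM
code itself: it assumes the stripped operators in the archived diff are
bitwise ANDs (bit 1 selects the half, the bits above bit 1 select the
constant), and the function name is hypothetical. A group fits when it touches
at most two distinct (constant, half) pairs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the selectors reference at most two distinct
 * (constant index, half) pairs. As in the patch, 0 doubles as the
 * "slot unused" marker, so this assumes real selectors never reduce
 * to a key of 0.
 */
static bool
fits_const_read_limitations(const unsigned *consts, size_t n)
{
   unsigned pair1 = 0, pair2 = 0;

   for (size_t i = 0; i < n; i++) {
      /* Drop bit 0 (channel within the half), keep index and half. */
      unsigned key = (consts[i] & ~3u) | (consts[i] & 2u);

      if (!pair1) {
         pair1 = key;
         continue;
      }
      if (pair1 == key)
         continue;
      if (!pair2) {
         pair2 = key;
         continue;
      }
      if (pair2 != key)
         return false; /* a third distinct pair would be needed */
   }
   return true;
}
```

For example, selectors 512, 513, 514 and 515 reduce to the keys 512 and 514
and fit, while 512, 514 and 516 need three pairs and do not.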

Re: [Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD

2013-03-13 Thread Jose Fonseca

- Original Message -
> Second attempt, 2 years ago no one replied or cared ...
>
> We really need to know about these on nvc0 because there are only 8
> fixed hardware locations that can be overwritten by sprite coordinates,
> and one location that represents gl_PointCoord and unconditionally
> returns sprite coordinates.
>
> So far this was solved via a hack, which works since the locations the
> state tracker picks aren't dynamic (and likely will never be, to facilitate
> ARB_separate_shader_objects), but it still isn't nice to do it this way.
>
> It looks like nv30 was using a hack, too, since it had a check for
> Semantic.Index == 9, which is what mesa uses for PointCoord.
>
> Implementing a safe, non-mesa-dependent way without these SEMANTICs would
> be jumping through hoops and doing expensive shader recompilations just
> because we like to destroy information at the gallium threshold, and that's
> unacceptable.
>
> I started to (try to) fix up the other drivers, but maybe we just want a CAP
> for this instead, since the default solution - if this is TEXCOORD then
> treat it as GENERIC with semantic index += MAX_TEXCOORDS - doesn't really
> look much nicer either.
> E.g. if PIPE_CAP_RESTRICTED_SPRITE_COORDS is advertised, the state tracker
> should use the TEXCOORD and PCOORD semantics, otherwise it should just use
> GENERICs as before.

Personally I have no objection to this, FWIW.

But please append the new TGSI_SEMANTIC_xxx without renumbering the existing
ones.

Jose

Re: [Mesa-dev] [PATCH] R600: Factorize code handling Const Read Port limitation

2013-03-13 Thread Tom Stellard

On Wed, Mar 13, 2013 at 09:12:41PM +0100, Vincent Lejeune wrote:
 ---
  lib/Target/R600/AMDILISelDAGToDAG.cpp| 34 ++
  lib/Target/R600/R600InstrInfo.cpp| 54 ++
  lib/Target/R600/R600InstrInfo.h  |  3 ++
  lib/Target/R600/R600MachineScheduler.cpp | 77 
 
  lib/Target/R600/R600MachineScheduler.h   |  3 +-
  test/CodeGen/R600/kcache-fold-2.ll   | 52 +
  6 files changed, 144 insertions(+), 79 deletions(-)
  create mode 100644 test/CodeGen/R600/kcache-fold-2.ll
 
 diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp 
 b/lib/Target/R600/AMDILISelDAGToDAG.cpp
 index 0c7880d..05a1ea7 100644
 --- a/lib/Target/R600/AMDILISelDAGToDAG.cpp
 +++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp
 @@ -336,6 +336,7 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
return Result;
  }
  
 +

Whitespace
  bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
  const R600InstrInfo *TII, std::vectorSDValue Ops) {
int OperandIdx[] = {
 @@ -365,17 +366,34 @@ bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
  SDValue Operand = Ops[OperandIdx[i] - 1];
  switch (Operand.getOpcode()) {
  case AMDGPUISD::CONST_ADDRESS: {
 -  if (i == 2)
 -break;
SDValue CstOffset;
 -  if (!Operand.getValueType().isVector() 
 -  SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset)) 
 {
 -Ops[OperandIdx[i] - 1] = CurDAG-getRegister(AMDGPU::ALU_CONST, 
 MVT::f32);
 -Ops[SelIdx[i] - 1] = CstOffset;
 -return true;
 +  if (Operand.getValueType().isVector() ||
 +  !SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset))
 +break;
 +
 +  // Gather others constants values
 +  std::vectorunsigned Consts;
 +  for (unsigned j = 0; j  3; j++) {
 +int SrcIdx = OperandIdx[j];
 +if (SrcIdx  0)
 +  break;
 +if (RegisterSDNode *Reg = dyn_castRegisterSDNode(Ops[SrcIdx - 1])) 
 {
 +  if (Reg-getReg() == AMDGPU::ALU_CONST) {
 +ConstantSDNode *Cst = dyn_castConstantSDNode(Ops[SelIdx[j] - 
 1]);
 +Consts.push_back(Cst-getZExtValue());
 +  }
 +}
}
 +
 +  ConstantSDNode *Cst = dyn_castConstantSDNode(CstOffset);
 +  Consts.push_back(Cst-getZExtValue());
 +  if (!TII-fitsConstReadLimitations(Consts))
 +break;
 +
 +  Ops[OperandIdx[i] - 1] = CurDAG-getRegister(AMDGPU::ALU_CONST, 
 MVT::f32);
 +  Ops[SelIdx[i] - 1] = CstOffset;
 +  return true;
}
 -  break;
  case ISD::FNEG:
if (NegIdx[i]  0)
  break;
 diff --git a/lib/Target/R600/R600InstrInfo.cpp 
 b/lib/Target/R600/R600InstrInfo.cpp
 index be3318a..0865098 100644
 --- a/lib/Target/R600/R600InstrInfo.cpp
 +++ b/lib/Target/R600/R600InstrInfo.cpp
 @@ -139,6 +139,60 @@ bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
(TargetFlags  R600_InstFlag::OP3));
  }
  
 +bool
 +R600InstrInfo::fitsConstReadLimitations(const std::vectorunsigned Consts)
 +const {
 +  assert (Consts.size() = 12  Too many operands in instructions group);
 +  unsigned Pair1 = 0, Pair2 = 0;
 +  for (unsigned i = 0, n = Consts.size(); i  n; ++i) {
 +unsigned ReadConstHalf = Consts[i]  2;
 +unsigned ReadConstIndex = Consts[i]  (~3);
 +unsigned ReadHalfConst = ReadConstIndex | ReadConstHalf;
 +if (!Pair1) {
 +  Pair1 = ReadHalfConst;
 +  continue;
 +}
 +if (Pair1 == ReadHalfConst)
 +  continue;
 +if (!Pair2) {
 +  Pair2 = ReadHalfConst;
 +  continue;
 +}
 +if (Pair2 != ReadHalfConst)
 +  return false;
 +  }
 +  return true;
 +}
 +
 +bool
 +R600InstrInfo::canBundle(const std::vectorMachineInstr * MIs) const {
 +  std::vectorunsigned Consts;
 +  for (unsigned i = 0, n = MIs.size(); i  n; i++) {
 +const MachineInstr *MI = MIs[i];
 +
 +const R600Operands::Ops OpTable[3][2] = {
 +  {R600Operands::SRC0, R600Operands::SRC0_SEL},
 +  {R600Operands::SRC1, R600Operands::SRC1_SEL},
 +  {R600Operands::SRC2, R600Operands::SRC2_SEL},
 +};
 +
 +if (!isALUInstr(MI-getOpcode()))
 +  continue;
 +
 +for (unsigned j = 0; j  3; j++) {
 +  int SrcIdx = getOperandIdx(MI-getOpcode(), OpTable[j][0]);
 +  if (SrcIdx  0)
 +break;
 +  if (MI-getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST) {
 +unsigned Const = MI-getOperand(
 +getOperandIdx(MI-getOpcode(), OpTable[j][1])).getImm();
 +Consts.push_back(Const);
 +  }
 +}
 +  }
 +  return fitsConstReadLimitations(Consts);
 +}
 +
  DFAPacketizer *R600InstrInfo::CreateTargetScheduleState(const TargetMachine 
 *TM,
  const ScheduleDAG *DAG) const {
const InstrItineraryData *II = TM-getInstrItineraryData();
 diff --git a/lib/Target/R600/R600InstrInfo.h b/lib/Target/R600/R600InstrInfo.h
 index efe721c..bf9569e 100644
 --- a/lib/Target/R600/R600InstrInfo.h
 +++

[Mesa-dev] [PATCH 1/3] softpipe: don't assert when creating surfaces with multiple layers

2013-03-13 Thread sroland

From: Roland Scheidegger srol...@vmware.com

We can't handle them yet, but we can safely just warn (we will
just render to the first layer, which is fine since we can't handle the
rendertarget system value either).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface;
instead, create the surface but assert when trying to set a buffer
in the framebuffer).
---
 src/gallium/drivers/softpipe/sp_texture.c|   30 +-
 src/gallium/drivers/softpipe/sp_tile_cache.c |   18 ++--
 2 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_texture.c 
b/src/gallium/drivers/softpipe/sp_texture.c
index 0d1481a..2db0de8 100644
--- a/src/gallium/drivers/softpipe/sp_texture.c
+++ b/src/gallium/drivers/softpipe/sp_texture.c
@@ -283,10 +283,6 @@ softpipe_create_surface(struct pipe_context *pipe,
                         const struct pipe_surface *surf_tmpl)
 {
    struct pipe_surface *ps;
-   unsigned level = surf_tmpl->u.tex.level;
-
-   assert(level <= pt->last_level);
-   assert(surf_tmpl->u.tex.first_layer == surf_tmpl->u.tex.last_layer);
 
    ps = CALLOC_STRUCT(pipe_surface);
    if (ps) {
@@ -294,12 +290,26 @@ softpipe_create_surface(struct pipe_context *pipe,
       pipe_resource_reference(&ps->texture, pt);
       ps->context = pipe;
       ps->format = surf_tmpl->format;
-      ps->width = u_minify(pt->width0, level);
-      ps->height = u_minify(pt->height0, level);
-
-      ps->u.tex.level = level;
-      ps->u.tex.first_layer = surf_tmpl->u.tex.first_layer;
-      ps->u.tex.last_layer = surf_tmpl->u.tex.last_layer;
+      if (pt->target != PIPE_BUFFER) {
+         assert(surf_tmpl->u.tex.level <= pt->last_level);
+         ps->width = u_minify(pt->width0, surf_tmpl->u.tex.level);
+         ps->height = u_minify(pt->height0, surf_tmpl->u.tex.level);
+         ps->u.tex.level = surf_tmpl->u.tex.level;
+         ps->u.tex.first_layer = surf_tmpl->u.tex.first_layer;
+         ps->u.tex.last_layer = surf_tmpl->u.tex.last_layer;
+         if (ps->u.tex.first_layer != ps->u.tex.last_layer) {
+            debug_printf("creating surface with multiple layers, rendering to first layer only\n");
+         }
+      }
+      else {
+         /* setting width as number of elements should get us correct renderbuffer width */
+         ps->width = surf_tmpl->u.buf.last_element - surf_tmpl->u.buf.first_element + 1;
+         ps->height = pt->height0;
+         ps->u.buf.first_element = surf_tmpl->u.buf.first_element;
+         ps->u.buf.last_element = surf_tmpl->u.buf.last_element;
+         assert(ps->u.buf.first_element <= ps->u.buf.last_element);
+         assert(ps->u.buf.last_element < ps->width);
+      }
    }
    return ps;
 }
diff --git a/src/gallium/drivers/softpipe/sp_tile_cache.c b/src/gallium/drivers/softpipe/sp_tile_cache.c
index dded0e1..b6dd6af 100644
--- a/src/gallium/drivers/softpipe/sp_tile_cache.c
+++ b/src/gallium/drivers/softpipe/sp_tile_cache.c
@@ -170,12 +170,18 @@ sp_tile_cache_set_surface(struct softpipe_tile_cache *tc,
    tc->surface = ps;
 
    if (ps) {
-      tc->transfer_map = pipe_transfer_map(pipe, ps->texture,
-                                           ps->u.tex.level, ps->u.tex.first_layer,
-                                           PIPE_TRANSFER_READ_WRITE |
-                                           PIPE_TRANSFER_UNSYNCHRONIZED,
-                                           0, 0, ps->width, ps->height,
-                                           &tc->transfer);
+      if (ps->texture->target != PIPE_BUFFER) {
+         tc->transfer_map = pipe_transfer_map(pipe, ps->texture,
+                                              ps->u.tex.level, ps->u.tex.first_layer,
+                                              PIPE_TRANSFER_READ_WRITE |
+                                              PIPE_TRANSFER_UNSYNCHRONIZED,
+                                              0, 0, ps->width, ps->height,
+                                              &tc->transfer);
+      }
+      else {
+         /* can't render to buffers */
+         assert(0);
+      }
 
       tc->depth_stencil = util_format_is_depth_or_stencil(ps->format);
    }
-- 
1.7.9.5


[Mesa-dev] [PATCH 2/3] llvmpipe: don't assert when trying to render to surfaces with multiple layers

2013-03-13 Thread sroland

From: Roland Scheidegger srol...@vmware.com

Instead, just warn when creating the surface; rendering will simply happen
to the first layer.
---
 src/gallium/drivers/llvmpipe/lp_scene.c   |2 --
 src/gallium/drivers/llvmpipe/lp_texture.c |3 +++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_scene.c 
b/src/gallium/drivers/llvmpipe/lp_scene.c
index a0912eb..a888586 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.c
+++ b/src/gallium/drivers/llvmpipe/lp_scene.c
@@ -157,7 +157,6 @@ lp_scene_begin_rasterization(struct lp_scene *scene)
    for (i = 0; i < scene->fb.nr_cbufs; i++) {
       struct pipe_surface *cbuf = scene->fb.cbufs[i];
       if (llvmpipe_resource_is_texture(cbuf->texture)) {
-         assert(cbuf->u.tex.first_layer == cbuf->u.tex.last_layer);
          scene->cbufs[i].stride = llvmpipe_resource_stride(cbuf->texture,
                                                            cbuf->u.tex.level);
 
@@ -178,7 +177,6 @@ lp_scene_begin_rasterization(struct lp_scene *scene)
 
    if (fb->zsbuf) {
       struct pipe_surface *zsbuf = scene->fb.zsbuf;
-      assert(zsbuf->u.tex.first_layer == zsbuf->u.tex.last_layer);
       scene->zsbuf.stride = llvmpipe_resource_stride(zsbuf->texture, zsbuf->u.tex.level);
       scene->zsbuf.blocksize = 
          util_format_get_blocksize(zsbuf->texture->format);
diff --git a/src/gallium/drivers/llvmpipe/lp_texture.c 
b/src/gallium/drivers/llvmpipe/lp_texture.c
index 9de05e7..99bd6d3 100644
--- a/src/gallium/drivers/llvmpipe/lp_texture.c
+++ b/src/gallium/drivers/llvmpipe/lp_texture.c
@@ -593,6 +593,9 @@ llvmpipe_create_surface(struct pipe_context *pipe,
          ps->u.tex.level = surf_tmpl->u.tex.level;
          ps->u.tex.first_layer = surf_tmpl->u.tex.first_layer;
          ps->u.tex.last_layer = surf_tmpl->u.tex.last_layer;
+         if (ps->u.tex.first_layer != ps->u.tex.last_layer) {
+            debug_printf("creating surface with multiple layers, rendering to first layer only\n");
+         }
       }
       else {
          /* setting width as number of elements should get us correct renderbuffer width */
-- 
1.7.9.5


[Mesa-dev] [PATCH 3/3] tgsi: fix sample_d emit for arrays

2013-03-13 Thread sroland

From: Roland Scheidegger srol...@vmware.com

Those cases were apparently forgotten.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c |   30 +++---
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 4488397..3df3ac3 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -2371,50 +2371,42 @@ exec_sample_d(struct tgsi_exec_machine *mach,
    /* always fetch all 3 offsets, overkill but keeps code simple */
    fetch_texel_offsets(mach, inst, offsets);
 
+   FETCH(&r[0], 0, TGSI_CHAN_X);
+
    switch (mach->SamplerViews[resource_unit].Resource) {
    case TGSI_TEXTURE_1D:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
+   case TGSI_TEXTURE_1D_ARRAY:
+      /* only 1D array actually needs Y */
+      FETCH(&r[1], 0, TGSI_CHAN_Y);
 
       fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_X, derivs[0]);
 
       fetch_texel(mach->Sampler, resource_unit, sampler_unit,
-                  &r[0], &ZeroVec, &ZeroVec, &ZeroVec, &ZeroVec,   /* S, T, P, C, LOD */
+                  &r[0], &r[1], &ZeroVec, &ZeroVec, &ZeroVec,   /* S, T, P, C, LOD */
                   derivs, offsets, tgsi_sampler_derivs_explicit,
                   &r[0], &r[1], &r[2], &r[3]);   /* R, G, B, A */
       break;
 
    case TGSI_TEXTURE_2D:
    case TGSI_TEXTURE_RECT:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
+   case TGSI_TEXTURE_2D_ARRAY:
+      /* only 2D array actually needs Z */
       FETCH(&r[1], 0, TGSI_CHAN_Y);
+      FETCH(&r[2], 0, TGSI_CHAN_Z);
 
       fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_X, derivs[0]);
       fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_Y, derivs[1]);
 
       fetch_texel(mach->Sampler, resource_unit, sampler_unit,
-                  &r[0], &r[1], &ZeroVec, &ZeroVec, &ZeroVec,   /* inputs */
+                  &r[0], &r[1], &r[2], &ZeroVec, &ZeroVec,   /* inputs */
                   derivs, offsets, tgsi_sampler_derivs_explicit,
                   &r[0], &r[1], &r[2], &r[3]); /* outputs */
       break;
 
    case TGSI_TEXTURE_3D:
    case TGSI_TEXTURE_CUBE:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
-      FETCH(&r[1], 0, TGSI_CHAN_Y);
-      FETCH(&r[2], 0, TGSI_CHAN_Z);
-
-      fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_X, derivs[0]);
-      fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_Y, derivs[1]);
-      fetch_assign_deriv_channel(mach, inst, 3, TGSI_CHAN_Z, derivs[2]);
-
-      fetch_texel(mach->Sampler, resource_unit, sampler_unit,
-                  &r[0], &r[1], &r[2], &ZeroVec, &ZeroVec,
-                  derivs, offsets, tgsi_sampler_derivs_explicit,
-                  &r[0], &r[1], &r[2], &r[3]);
-      break;
-
    case TGSI_TEXTURE_CUBE_ARRAY:
-      FETCH(&r[0], 0, TGSI_CHAN_X);
+      /* only cube array actually needs W */
       FETCH(&r[1], 0, TGSI_CHAN_Y);
       FETCH(&r[2], 0, TGSI_CHAN_Z);
       FETCH(&r[3], 0, TGSI_CHAN_W);
-- 
1.7.9.5
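
The grouping this patch introduces, as far as the visible hunk shows, pairs
each texture target with its array variant and fetches one more coordinate
channel than the non-array target strictly needs. A rough C sketch of that
mapping follows; the enum and function are hypothetical names for
illustration only, and the counts are read off the hunk above rather than
taken from tgsi_exec.c as a whole.

```c
/* Hypothetical stand-ins for the TGSI_TEXTURE_* targets in the hunk. */
enum sketch_target {
   SKETCH_TEXTURE_1D,
   SKETCH_TEXTURE_1D_ARRAY,
   SKETCH_TEXTURE_2D,
   SKETCH_TEXTURE_RECT,
   SKETCH_TEXTURE_2D_ARRAY,
   SKETCH_TEXTURE_3D,
   SKETCH_TEXTURE_CUBE,
   SKETCH_TEXTURE_CUBE_ARRAY
};

/* Number of coordinate channels fetched for sample_d after the patch:
 * X is always fetched before the switch, then each group adds the
 * channels its array variant needs (the layer index goes in the last one).
 */
static int
coord_channels_fetched(enum sketch_target target)
{
   switch (target) {
   case SKETCH_TEXTURE_1D:
   case SKETCH_TEXTURE_1D_ARRAY:
      return 2; /* X, Y (only the array uses Y) */
   case SKETCH_TEXTURE_2D:
   case SKETCH_TEXTURE_RECT:
   case SKETCH_TEXTURE_2D_ARRAY:
      return 3; /* X, Y, Z (only the array uses Z) */
   case SKETCH_TEXTURE_3D:
   case SKETCH_TEXTURE_CUBE:
   case SKETCH_TEXTURE_CUBE_ARRAY:
      return 4; /* X, Y, Z, W (only the cube array uses W) */
   }
   return 0;
}
```

The extra fetch for the non-array targets is harmless in this reading: the
value is passed along but ignored by the sampler for those targets.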


[Mesa-dev] [PATCH] i965: Simplify separate stencil check

2013-03-13 Thread Paul Berry

The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 1640590..a47f6d8 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -247,7 +247,7 @@ intel_miptree_create_layout(struct intel_context *intel,
    mt->physical_depth0 = depth0;
 
    if (!for_region &&
-       _mesa_is_depthstencil_format(_mesa_get_format_base_format(format)) &&
+       _mesa_get_format_base_format(format) == GL_DEPTH_STENCIL &&
        (intel->must_use_separate_stencil ||
         (intel->has_separate_stencil &&
          intel->vtbl.is_hiz_depth_format(intel, format)))) {
-- 
1.8.1.5


[Mesa-dev] [PATCH] R600: Lower clamp constant to constant

2013-03-13 Thread Vincent Lejeune

---
 lib/Target/R600/R600ISelLowering.cpp | 23 +++
 test/CodeGen/R600/clamp-constants.ll | 20 
 2 files changed, 43 insertions(+)
 create mode 100644 test/CodeGen/R600/clamp-constants.ll

diff --git a/lib/Target/R600/R600ISelLowering.cpp 
b/lib/Target/R600/R600ISelLowering.cpp
index a73691d..96686e6 100644
--- a/lib/Target/R600/R600ISelLowering.cpp
+++ b/lib/Target/R600/R600ISelLowering.cpp
@@ -394,6 +394,29 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
 
       return SDValue(interp, slot % 2);
     }
+    case AMDGPUIntrinsic::AMDIL_clamp: {
+      ConstantFPSDNode *Min = dyn_cast<ConstantFPSDNode>(Op.getOperand(2));
+      ConstantFPSDNode *Max = dyn_cast<ConstantFPSDNode>(Op.getOperand(3));
+      if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Op.getOperand(1))) {
+        switch (C->getValueAPF().compare(Max->getValueAPF())) {
+        case APFloat::cmpGreaterThan:
+        case APFloat::cmpEqual:
+          return Op.getOperand(3);
+        default:
+          break;
+        }
+
+        switch (C->getValueAPF().compare(Min->getValueAPF())) {
+        case APFloat::cmpLessThan:
+        case APFloat::cmpEqual:
+          return Op.getOperand(2);
+        default:
+          break;
+        }
+        return Op.getOperand(1);
+      }
+      break;
+    }
 
 case r600_read_ngroups_x:
   return LowerImplicitParameter(DAG, VT, DL, 0);
diff --git a/test/CodeGen/R600/clamp-constants.ll 
b/test/CodeGen/R600/clamp-constants.ll
new file mode 100644
index 000..cf4d35f
--- /dev/null
+++ b/test/CodeGen/R600/clamp-constants.ll
@@ -0,0 +1,20 @@
+;RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s
+
+;CHECK-NOT: MOV
+
+define void @main() {
+main_body:
+  %0 = call float @llvm.AMDIL.clamp.(float 1.50e+00, float 0.00e+00, float 1.00e+00)
+  %1 = call float @llvm.AMDIL.clamp.(float 0.00e+00, float 0.00e+00, float 1.00e+00)
+  %2 = call float @llvm.AMDIL.clamp.(float 1.00e+00, float 0.00e+00, float 1.00e+00)
+  %3 = call float @llvm.AMDIL.clamp.(float -0.50e+00, float 0.00e+00, float 1.00e+00)
+  %4 = insertelement <4 x float> undef, float %0, i32 0
+  %5 = insertelement <4 x float> %4, float %1, i32 1
+  %6 = insertelement <4 x float> %5, float %2, i32 2
+  %7 = insertelement <4 x float> %6, float %3, i32 3
+  call void @llvm.R600.store.swizzle(<4 x float> %7, i32 0, i32 0)
+  ret void
+}
+
+declare float @llvm.AMDIL.clamp.(float, float, float) readnone
+declare void @llvm.R600.store.swizzle(<4 x float>, i32, i32)
-- 
1.8.1.4


Re: [Mesa-dev] [PATCH] R600: Lower clamp constant to constant

2013-03-13 Thread Tom Stellard

On Wed, Mar 13, 2013 at 10:26:38PM +0100, Vincent Lejeune wrote:
 ---
  lib/Target/R600/R600ISelLowering.cpp | 23 +++
  test/CodeGen/R600/clamp-constants.ll | 20 
  2 files changed, 43 insertions(+)
  create mode 100644 test/CodeGen/R600/clamp-constants.ll


I like this idea, but I think a better solution would be to replace
llvm.AMDIL.clamp with LLVM IR in the frontend and add a pattern for
clamp in the backend.  This way the LLVM optimizers would handle the
constant folding for us, and we would also be able to optimize open-coded
clamps.

-Tom
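
To make the trade-off concrete, here is a minimal Python sketch (not LLVM code) of the two approaches under discussion. `fold_clamp` mirrors the comparison order used in the patch for constant operands, and `clamp_via_min_max` shows clamp open-coded as min/max, which is the kind of form a generic optimizer could fold on its own. Both function names are made up for this illustration, and NaN handling is ignored.

```python
def fold_clamp(value, lo, hi):
    """Fold clamp(value, lo, hi) when all three operands are constants.

    Same order as the patch: compare against the upper bound first,
    then the lower bound, otherwise keep the value unchanged.
    """
    if value >= hi:
        return hi
    if value <= lo:
        return lo
    return value


def clamp_via_min_max(value, lo, hi):
    """Clamp open-coded as min/max (hypothetical helper for comparison)."""
    return max(lo, min(value, hi))


# The four constant inputs used in clamp-constants.ll, clamped to [0.0, 1.0].
inputs = [1.5, 0.0, 1.0, -0.5]
folded = [fold_clamp(x, 0.0, 1.0) for x in inputs]
open_coded = [clamp_via_min_max(x, 0.0, 1.0) for x in inputs]
```

For these inputs both forms agree, which is why folding the open-coded form in the optimizer would cover the intrinsic case as well.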

 diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
 index a73691d..96686e6 100644
 --- a/lib/Target/R600/R600ISelLowering.cpp
 +++ b/lib/Target/R600/R600ISelLowering.cpp
 @@ -394,6 +394,29 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
  
        return SDValue(interp, slot % 2);
      }
 +    case AMDGPUIntrinsic::AMDIL_clamp: {
 +      ConstantFPSDNode *Min = dyn_cast<ConstantFPSDNode>(Op.getOperand(2));
 +      ConstantFPSDNode *Max = dyn_cast<ConstantFPSDNode>(Op.getOperand(3));
 +      if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Op.getOperand(1))) {
 +        switch (C->getValueAPF().compare(Max->getValueAPF())) {
 +        case APFloat::cmpGreaterThan:
 +        case APFloat::cmpEqual:
 +          return Op.getOperand(3);
 +        default:
 +          break;
 +        }
 +
 +        switch (C->getValueAPF().compare(Min->getValueAPF())) {
 +        case APFloat::cmpLessThan:
 +        case APFloat::cmpEqual:
 +          return Op.getOperand(2);
 +        default:
 +          break;
 +        }
 +        return Op.getOperand(1);
 +      }
 +      break;
 +    }
  
      case r600_read_ngroups_x:
        return LowerImplicitParameter(DAG, VT, DL, 0);
 diff --git a/test/CodeGen/R600/clamp-constants.ll b/test/CodeGen/R600/clamp-constants.ll
 new file mode 100644
 index 000..cf4d35f
 --- /dev/null
 +++ b/test/CodeGen/R600/clamp-constants.ll
 @@ -0,0 +1,20 @@
 +;RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s
 +
 +;CHECK-NOT: MOV
 +
 +define void @main() {
 +main_body:
 +  %0 = call float @llvm.AMDIL.clamp.(float 1.50e+00, float 0.00e+00, float 1.00e+00)
 +  %1 = call float @llvm.AMDIL.clamp.(float 0.00e+00, float 0.00e+00, float 1.00e+00)
 +  %2 = call float @llvm.AMDIL.clamp.(float 1.00e+00, float 0.00e+00, float 1.00e+00)
 +  %3 = call float @llvm.AMDIL.clamp.(float -0.50e+00, float 0.00e+00, float 1.00e+00)
 +  %4 = insertelement <4 x float> undef, float %0, i32 0
 +  %5 = insertelement <4 x float> %4, float %1, i32 1
 +  %6 = insertelement <4 x float> %5, float %2, i32 2
 +  %7 = insertelement <4 x float> %6, float %3, i32 3
 +  call void @llvm.R600.store.swizzle(<4 x float> %7, i32 0, i32 0)
 +  ret void
 +}
 +
 +declare float @llvm.AMDIL.clamp.(float, float, float) readnone
 +declare void @llvm.R600.store.swizzle(<4 x float>, i32, i32)
 -- 
 1.8.1.4
 

Re: [Mesa-dev] [PATCH] R600: Factorize code handling Const Read Port limitation

2013-03-13 Thread Vincent Lejeune

I fixed the coding style issue.
The iostream include was a leftover debug line; it shouldn't be there.


- Original Message -
 From: Tom Stellard t...@stellard.net
 To: Vincent Lejeune v...@ovi.com
 Cc: llvm-comm...@cs.uiuc.edu; mesa-dev@lists.freedesktop.org
 Sent: Wednesday, March 13, 2013, 21:49
 Subject: Re: [PATCH] R600: Factorize code handling Const Read Port limitation
 
 On Wed, Mar 13, 2013 at 09:12:41PM +0100, Vincent Lejeune wrote:
  ---
   lib/Target/R600/AMDILISelDAGToDAG.cpp    | 34 ++
   lib/Target/R600/R600InstrInfo.cpp        | 54 ++
   lib/Target/R600/R600InstrInfo.h          |  3 ++
   lib/Target/R600/R600MachineScheduler.cpp | 77 
 
   lib/Target/R600/R600MachineScheduler.h   |  3 +-
   test/CodeGen/R600/kcache-fold-2.ll       | 52 +
   6 files changed, 144 insertions(+), 79 deletions(-)
   create mode 100644 test/CodeGen/R600/kcache-fold-2.ll
 
  diff --git a/lib/Target/R600/AMDILISelDAGToDAG.cpp 
 b/lib/Target/R600/AMDILISelDAGToDAG.cpp
  index 0c7880d..05a1ea7 100644
  --- a/lib/Target/R600/AMDILISelDAGToDAG.cpp
  +++ b/lib/Target/R600/AMDILISelDAGToDAG.cpp
  @@ -336,6 +336,7 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
     return Result;
   }
   
  +
 
 Whitespace
   bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
       const R600InstrInfo *TII, std::vector<SDValue> &Ops) {
     int OperandIdx[] = {
  @@ -365,17 +366,34 @@ bool AMDGPUDAGToDAGISel::FoldOperands(unsigned Opcode,
       SDValue Operand = Ops[OperandIdx[i] - 1];
       switch (Operand.getOpcode()) {
       case AMDGPUISD::CONST_ADDRESS: {
  -      if (i == 2)
  -        break;
         SDValue CstOffset;
  -      if (!Operand.getValueType().isVector() &&
  -          SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset)) {
  -        Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
  -        Ops[SelIdx[i] - 1] = CstOffset;
  -        return true;
  +      if (Operand.getValueType().isVector() ||
  +          !SelectGlobalValueConstantOffset(Operand.getOperand(0), CstOffset))
  +        break;
  +
  +      // Gather others constants values
  +      std::vector<unsigned> Consts;
  +      for (unsigned j = 0; j < 3; j++) {
  +        int SrcIdx = OperandIdx[j];
  +        if (SrcIdx < 0)
  +          break;
  +        if (RegisterSDNode *Reg = dyn_cast<RegisterSDNode>(Ops[SrcIdx - 1])) {
  +          if (Reg->getReg() == AMDGPU::ALU_CONST) {
  +            ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(Ops[SelIdx[j] - 1]);
  +            Consts.push_back(Cst->getZExtValue());
  +          }
  +        }
         }
  +
  +      ConstantSDNode *Cst = dyn_cast<ConstantSDNode>(CstOffset);
  +      Consts.push_back(Cst->getZExtValue());
  +      if (!TII->fitsConstReadLimitations(Consts))
  +        break;
  +
  +      Ops[OperandIdx[i] - 1] = CurDAG->getRegister(AMDGPU::ALU_CONST, MVT::f32);
  +      Ops[SelIdx[i] - 1] = CstOffset;
  +      return true;
         }
  -      break;
       case ISD::FNEG:
         if (NegIdx[i] < 0)
           break;
  diff --git a/lib/Target/R600/R600InstrInfo.cpp b/lib/Target/R600/R600InstrInfo.cpp
  index be3318a..0865098 100644
  --- a/lib/Target/R600/R600InstrInfo.cpp
  +++ b/lib/Target/R600/R600InstrInfo.cpp
  @@ -139,6 +139,60 @@ bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
             (TargetFlags & R600_InstFlag::OP3));
   }
   
  +bool
  +R600InstrInfo::fitsConstReadLimitations(const std::vector<unsigned> &Consts)
  +    const {
  +  assert (Consts.size() <= 12 && "Too many operands in instructions group");
  +  unsigned Pair1 = 0, Pair2 = 0;
  +  for (unsigned i = 0, n = Consts.size(); i < n; ++i) {
  +    unsigned ReadConstHalf = Consts[i] & 2;
  +    unsigned ReadConstIndex = Consts[i] & (~3);
  +    unsigned ReadHalfConst = ReadConstIndex | ReadConstHalf;
  +    if (!Pair1) {
  +      Pair1 = ReadHalfConst;
  +      continue;
  +    }
  +    if (Pair1 == ReadHalfConst)
  +      continue;
  +    if (!Pair2) {
  +      Pair2 = ReadHalfConst;
  +      continue;
  +    }
  +    if (Pair2 != ReadHalfConst)
  +      return false;
  +  }
  +  return true;
  +}
  +
  +bool
  +R600InstrInfo::canBundle(const std::vector<MachineInstr *> &MIs) const {
  +  std::vector<unsigned> Consts;
  +  for (unsigned i = 0, n = MIs.size(); i < n; i++) {
  +    const MachineInstr *MI = MIs[i];
  +
  +    const R600Operands::Ops OpTable[3][2] = {
  +      {R600Operands::SRC0, R600Operands::SRC0_SEL},
  +      {R600Operands::SRC1, R600Operands::SRC1_SEL},
  +      {R600Operands::SRC2, R600Operands::SRC2_SEL},
  +    };
  +
  +    if (!isALUInstr(MI->getOpcode()))
  +      continue;
  +
  +    for (unsigned j = 0; j < 3; j++) {
  +      int SrcIdx = getOperandIdx(MI->getOpcode(), OpTable[j][0]);
  +      if (SrcIdx < 0)
  +        break;
  +      if (MI->getOperand(SrcIdx).getReg() == AMDGPU::ALU_CONST) {
  +        unsigned Const = MI->getOperand(
  +
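
The check in `fitsConstReadLimitations` can be restated as a small Python sketch: each constant selector is reduced to a key made of its index with the low two bits masked off plus bit 1 of the selector, and a group is accepted only if it touches at most two distinct keys. This is an illustration of the quoted logic rather than the LLVM implementation. The function name `fits_const_read_limitations` is hypothetical, and it uses a set, so unlike the patch it does not treat a key of 0 as "unset".

```python
def fits_const_read_limitations(consts):
    """Return True if the selectors use at most two distinct read keys."""
    pairs = set()
    for sel in consts:
        half = sel & 2        # bit 1 of the selector (ReadConstHalf in the patch)
        index = sel & ~3      # selector with the low two bits cleared
        pairs.add(index | half)
        if len(pairs) > 2:
            return False
    return True


same_half = fits_const_read_limitations([0, 1])          # one key: 0
two_halves = fits_const_read_limitations([4, 5, 6, 7])   # keys 4 and 6
too_many = fits_const_read_limitations([0, 2, 4])        # keys 0, 2 and 4
```

Using a set keeps the rule ("no more than two distinct keys") explicit, at the cost of not matching the patch's behaviour for a key equal to 0.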

Re: [Mesa-dev] [PATCH] gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD

2013-03-13 Thread Alex Deucher

On Wed, Mar 13, 2013 at 3:51 PM, Christoph Bumiller
e0425...@student.tuwien.ac.at wrote:
 Second attempt, 2 years ago no one replied or cared ...

 We really need to know about these on nvc0 because there are only 8
 fixed hardware locations that can be overwritten by sprite coordinates,
 and one location that represents gl_PointCoord and unconditionally
 returns sprite coordinates.

 So far this was solved via a hack, which works since the locations the
 state tracker picks aren't dynamic (and likely will never be, to facilitate
 ARB_separate_shader_objects), but it still isn't nice to do it this way.

 It looks like nv30 was using a hack, too, since it had a check for
 Semantic.Index == 9, which is what mesa uses for PointCoord.

 Implementing a safe, non-mesa-dependent way without these SEMANTICs would
 be jumping through hoops and doing expensive shader recompilations just
 because we like to destroy information at the gallium threshold, and that's
 unacceptable.

 I started to (try) fix up the other drivers, but maybe we just want a CAP
 for this instead, since the default solution - if this is TEXCOORD then
 treat it as GENERIC with semantic index += MAX_TEXCOORDS - doesn't really
 look that nicer either.
 E.g. if PIPE_CAP_RESTRICTED_SPRITE_COORDS is advertised, the state tracker
 should use the TEXCOORD and PCOORD semantics, otherwise it should just use
 GENERICs as before.
 ---
  src/gallium/auxiliary/draw/draw_pipe_wide_point.c  |   39 
  src/gallium/auxiliary/tgsi/tgsi_dump.c |1 +
  src/gallium/auxiliary/tgsi/tgsi_strings.c  |2 +
  src/gallium/docs/source/cso/rasterizer.rst |2 +-
  src/gallium/docs/source/tgsi.rst   |   23 +-
  src/gallium/drivers/freedreno/freedreno_compiler.c |2 +
  src/gallium/drivers/i915/i915_fpc_translate.c  |2 +
  src/gallium/drivers/i915/i915_state_derived.c  |4 ++
  src/gallium/drivers/llvmpipe/lp_setup_point.c  |   29 ++--
  src/gallium/drivers/nv30/nvfx_fragprog.c   |   39 
  src/gallium/drivers/nv50/nv50_shader_state.c   |8 +--
  src/gallium/drivers/nv50/nv50_surface.c|5 +-
  src/gallium/drivers/nvc0/nvc0_program.c|   37 +--
  src/gallium/drivers/r300/r300_fs.c |2 +
  src/gallium/drivers/r300/r300_shader_semantics.h   |3 +-
  src/gallium/drivers/r300/r300_vs.c |2 +
  src/gallium/drivers/r600/evergreen_state.c |7 ++-
  src/gallium/drivers/r600/r600_shader.c |3 +-
  src/gallium/drivers/r600/r600_state.c  |7 ++-
  src/gallium/drivers/radeonsi/radeonsi_shader.c |1 +
  src/gallium/drivers/radeonsi/si_state.c|2 +-
  src/gallium/drivers/radeonsi/si_state_draw.c   |5 +-
  src/gallium/include/pipe/p_shader_tokens.h |   36 +--
  src/gallium/include/pipe/p_state.h |2 +-
  src/mesa/state_tracker/st_atom_rasterizer.c|6 +--
  src/mesa/state_tracker/st_program.c|   48 +--
  26 files changed, 162 insertions(+), 155 deletions(-)

 diff --git a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c 
 b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
 index 8e0a117..d4ed0f7 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
 +++ b/src/gallium/auxiliary/draw/draw_pipe_wide_point.c
 @@ -233,28 +233,29 @@ widepoint_first_point(struct draw_stage *stage,

     wide->num_texcoord_gen = 0;

 -   /* Loop over fragment shader inputs looking for generic inputs
 -    * for which bit 'k' in sprite_coord_enable is set.
 +   /* Loop over fragment shader inputs looking for the PCOORD input or
 +    * TEXCOORD inputs for which bit 'k' in sprite_coord_enable is set.
      */
     for (i = 0; i < fs->info.num_inputs; i++) {
 -      if (fs->info.input_semantic_name[i] == TGSI_SEMANTIC_GENERIC) {
 -         const int generic_index = fs->info.input_semantic_index[i];
 -         /* Note that sprite_coord enable is a bitfield of
 -          * PIPE_MAX_SHADER_OUTPUTS bits.
 -          */
 -         if (generic_index < PIPE_MAX_SHADER_OUTPUTS &&
 -             (rast->sprite_coord_enable & (1 << generic_index))) {
 -            /* OK, this generic attribute needs to be replaced with a
 -             * texcoord (see above).
 -             */
 -            int slot = draw_alloc_extra_vertex_attrib(draw,
 -                                                      TGSI_SEMANTIC_GENERIC,
 -                                                      generic_index);
 -
 -            /* add this slot to the texcoord-gen list */
 -            wide->texcoord_gen_slot[wide->num_texcoord_gen++] = slot;
 -         }
 +      int slot;
 +      const unsigned sn = fs->info.input_semantic_name[i];
 +      const unsigned si = fs->info.input_semantic_index[i];
 +
 + if (sn ==
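
As a rough illustration of the bitfield test in the removed lines above, the following Python sketch selects the input indices whose bit is set in `sprite_coord_enable`. It only models the old GENERIC-based check, since the replacement code is cut off in this message. The helper name and the value used for `PIPE_MAX_SHADER_OUTPUTS` are assumptions made for the example, not values taken from the Gallium headers.

```python
PIPE_MAX_SHADER_OUTPUTS = 32  # placeholder value, for illustration only


def inputs_needing_sprite_coords(semantic_indices, sprite_coord_enable):
    """Return the indices whose bit is set in the sprite_coord_enable mask."""
    selected = []
    for index in semantic_indices:
        in_range = index < PIPE_MAX_SHADER_OUTPUTS
        if in_range and (sprite_coord_enable & (1 << index)):
            selected.append(index)
    return selected


# Bits 0, 2 and 9 enabled; index 1 is present but not enabled.
mask = (1 << 0) | (1 << 2) | (1 << 9)
selected = inputs_needing_sprite_coords([0, 1, 2, 9], mask)
```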

Re: [Mesa-dev] OSMesa VTK

2013-03-13 Thread Brian Paul


On 03/13/2013 12:08 PM, Kevin H. Hobbs wrote:

On 03/13/2013 10:40 AM, Brian Paul wrote:


Can you tell me what the parameters are for your OSMesaCreateContext()
and OSMesaMakeCurrent() calls (in particular the image format/type)?


I do not know. I'll do some digging into the VTK source and ask on the
VTK list.



You could also try setting GALLIUM_DRIVER=softpipe and see if the
softpipe driver works.




The test still fails, however it fails differently, instead of a black
image I get a strip of colored snow at the top of the output image...

How about I just show you?

The output (boring black image) of last night's VTK LoadOpenGLExtension
test which used llvmpipe is here :

http://open.cdash.org/testDetails.php?test=180871306&build=2844006

The output of:

$ env GALLIUM_DRIVER=softpipe \
   /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkRenderingOpenGLCxxTests \
   LoadOpenGLExtension -D /home/kevin/kitware/VTKData \
   -T /home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary \
   -V Baseline/Rendering/LoadOpenGLExtension.png

is here :

  http://crab-lab.zool.ohiou.edu/kevin/LoadOpenGLExtension.png



Could I get a binary of one of your test programs?  Linux 64-bit?

-Brian


Re: [Mesa-dev] OSMesa VTK

2013-03-13 Thread Kevin H. Hobbs

Sure:
http://crab-lab.zool.ohiou.edu/kevin/vtkRenderingOpenGLCxxTests

Re: [Mesa-dev] [PATCH] [libclc] configure: Enable building separate libraries for target variants

2013-03-13 Thread Aaron Watry

The python changes in this file look good to me. I haven't done a
line-by-line review of the SI changes.

I tested this patch and v2 of the related mesa series on r600g (radeon
6850) with a recent LLVM and fresh mesa master as of this evening. No real
change in the piglit CL test success/failure rate.

Do you have any interest in trying to merge your changes to date back into
the upstream libclc codebase?  If you think it's a good idea, but don't
have time to do it yourself, let me know and I'll try to re-base the series
of patches.

--Aaron


On Tue, Mar 12, 2013 at 3:20 PM, Tom Stellard t...@stellard.net wrote:

 From: Tom Stellard thomas.stell...@amd.com

 ---
  configure.py |  119
 -
  1 files changed, 75 insertions(+), 44 deletions(-)

 diff --git a/configure.py b/configure.py
 index d861c24..dfd9a8f 100755
 --- a/configure.py
 +++ b/configure.py
 @@ -68,6 +68,15 @@ llvm_clang = os.path.join(llvm_bindir, 'clang')
  llvm_link = os.path.join(llvm_bindir, 'llvm-link')
  llvm_opt = os.path.join(llvm_bindir, 'opt')

 +available_targets = {
 +  'r600--' : { 'devices' :
 +       [{'gpu' : 'cedar',   'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
 +        {'gpu' : 'cypress', 'aliases' : ['hemlock']},
 +        {'gpu' : 'barts',   'aliases' : ['turks', 'caicos']},
 +        {'gpu' : 'cayman',  'aliases' : ['aruba']},
 +        {'gpu' : 'tahiti',  'aliases' : ['pitcairn', 'verde', 'oland']}]}
 +}
 +
  default_targets = ['r600--']

  targets = args
 @@ -127,50 +136,72 @@ for target in targets:

    clang_cl_includes = ' '.join(["-I%s" % incdir for incdir in incdirs])

 -  # The rule for building a .bc file for the specified architecture using clang.
 -  clang_bc_flags = "-target %s -I`dirname $in` %s " \
 -                   "-Dcl_clang_storage_class_specifiers " \
 -                   "-Dcl_khr_fp64 " \
 -                   "-emit-llvm" % (target, clang_cl_includes)
 -  clang_bc_rule = "CLANG_CL_BC_" + target
 -  c_compiler_rule(b, clang_bc_rule, "LLVM-CC", llvm_clang, clang_bc_flags)
 -
 -  objects = []
 -  sources_seen = set()
 -
 -  for libdir in libdirs:
 -    subdir_list_file = os.path.join(libdir, 'SOURCES')
 -    manifest_deps.add(subdir_list_file)
 -    override_list_file = os.path.join(libdir, 'OVERRIDES')
 -
 -    # Add target overrides
 -    if os.path.exists(override_list_file):
 -      for override in open(override_list_file).readlines():
 -        override = override.rstrip()
 -        sources_seen.add(override)
 -
 -    for src in open(subdir_list_file).readlines():
 -      src = src.rstrip()
 -      if src not in sources_seen:
 -        sources_seen.add(src)
 -        obj = os.path.join(target, 'lib', src + '.bc')
 -        objects.append(obj)
 -        src_file = os.path.join(libdir, src)
 -        ext = os.path.splitext(src)[1]
 -        if ext == '.ll':
 -          b.build(obj, 'LLVM_AS', src_file)
 -        else:
 -          b.build(obj, clang_bc_rule, src_file)
 -
 -  builtins_link_bc = os.path.join(target, 'lib', 'builtins.link.bc')
 -  builtins_opt_bc = os.path.join(target, 'lib', 'builtins.opt.bc')
 -  builtins_bc = os.path.join('built_libs', target + '.bc')
 -  b.build(builtins_link_bc, "LLVM_LINK", objects)
 -  b.build(builtins_opt_bc, "OPT", builtins_link_bc)
 -  b.build(builtins_bc, "PREPARE_BUILTINS", builtins_opt_bc, prepare_builtins)
 -  install_files_bc.append((builtins_bc, builtins_bc))
 -  install_deps.append(builtins_bc)
 -  b.default(builtins_bc)
 +  for device in available_targets[target]['devices']:
 +    # The rule for building a .bc file for the specified architecture using clang.
 +    clang_bc_flags = "-target %s -I`dirname $in` %s " \
 +                     "-Dcl_clang_storage_class_specifiers " \
 +                     "-Dcl_khr_fp64 " \
 +                     "-emit-llvm" % (target, clang_cl_includes)
 +    if device['gpu'] != '':
 +      clang_bc_flags += ' -mcpu=' + device['gpu']
 +    clang_bc_rule = "CLANG_CL_BC_" + target
 +    c_compiler_rule(b, clang_bc_rule, "LLVM-CC", llvm_clang, clang_bc_flags)
 +
 +    objects = []
 +    sources_seen = set()
 +
 +    if device['gpu'] == '':
 +      full_target_name = target
 +      obj_suffix = ''
 +    else:
 +      full_target_name = device['gpu'] + '-' + target
 +      obj_suffix = '.' + device['gpu']
 +
 +    for libdir in libdirs:
 +      subdir_list_file = os.path.join(libdir, 'SOURCES')
 +      manifest_deps.add(subdir_list_file)
 +      override_list_file = os.path.join(libdir, 'OVERRIDES')
 +
 +      # Add target overrides
 +      if os.path.exists(override_list_file):
 +        for override in open(override_list_file).readlines():
 +          override = override.rstrip()
 +          sources_seen.add(override)
 +
 +      for src in open(subdir_list_file).readlines():
 +        src = src.rstrip()
 +        # Only add the base filename (e.g. Add get_global_id instead of
 +        # get_global_id.cl) to sources_seen.
 +        #
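
The per-device naming introduced by this patch can be tried out in isolation with the short Python sketch below. It reuses the `available_targets` table and the `full_target_name` / `obj_suffix` rules from the quoted diff. The helper `variant_names` is a made-up name for this example, and how the `aliases` lists are consumed is not visible in the truncated patch, so they are carried along untouched.

```python
available_targets = {
    'r600--': {'devices': [
        {'gpu': 'cedar', 'aliases': ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
        {'gpu': 'cypress', 'aliases': ['hemlock']},
        {'gpu': 'barts', 'aliases': ['turks', 'caicos']},
        {'gpu': 'cayman', 'aliases': ['aruba']},
        {'gpu': 'tahiti', 'aliases': ['pitcairn', 'verde', 'oland']},
    ]},
}


def variant_names(target):
    """Return (full_target_name, obj_suffix) pairs, one per device variant."""
    names = []
    for device in available_targets[target]['devices']:
        if device['gpu'] == '':
            # No specific GPU: fall back to the plain target triple.
            names.append((target, ''))
        else:
            names.append((device['gpu'] + '-' + target, '.' + device['gpu']))
    return names


variants = variant_names('r600--')
```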

Re: [Mesa-dev] [PATCH] i965: Split shader_time entries into separate cachelines.

2013-03-13 Thread Kenneth Graunke


On 03/13/2013 10:45 AM, Eric Anholt wrote:

This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).


Plausible!


Improves performance of a minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).


+/- 18.3%...lol.  Still, nice improvement.  This should make the tool 
much nicer to use.


Reviewed-by: Kenneth Graunke kenn...@whitecape.org

In case I forgot, all shader_time patches you've sent so far get a R-b. 
 I remember reading them, just not whether I acked them.



v2: Add a define for the stride with a comment explaining its units and
 why.
---
  src/mesa/drivers/dri/i965/brw_context.h |8 
  src/mesa/drivers/dri/i965/brw_fs.cpp|2 +-
  src/mesa/drivers/dri/i965/brw_program.c |5 +++--
  src/mesa/drivers/dri/i965/brw_vec4.cpp  |2 +-
  4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index c34d6b1..d042dd6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -571,6 +571,14 @@ struct brw_vs_prog_data {
  #define SURF_INDEX_SOL_BINDING(t)((t))
  #define BRW_MAX_GS_SURFACES  
SURF_INDEX_SOL_BINDING(BRW_MAX_SOL_BINDINGS)

+/**
+ * Stride in bytes between shader_time entries.
+ *
+ * We separate entries by a cacheline to reduce traffic between EUs writing to
+ * different entries.
+ */
+#define SHADER_TIME_STRIDE 64
+
  enum brw_cache_id {
 BRW_BLEND_STATE,
 BRW_DEPTH_STENCIL_STATE,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8ce3954..8476bb5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -621,7 +621,7 @@ fs_visitor::emit_shader_time_write(enum 
shader_time_shader_type type,

 fs_reg offset_mrf = fs_reg(MRF, base_mrf);
 offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, fs_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, fs_reg(shader_time_index * SHADER_TIME_STRIDE)));

 fs_reg time_mrf = fs_reg(MRF, base_mrf + 1);
 time_mrf.type = BRW_REGISTER_TYPE_UD;
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 75eb6bc..62954d3 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -228,7 +228,8 @@ brw_init_shader_time(struct brw_context *brw)

    const int max_entries = 4096;
    brw->shader_time.bo = drm_intel_bo_alloc(intel->bufmgr, "shader time",
-                                            max_entries * 4, 4096);
+                                            max_entries * SHADER_TIME_STRIDE,
+                                            4096);
    brw->shader_time.programs = rzalloc_array(brw, struct gl_shader_program *,
                                              max_entries);
    brw->shader_time.types = rzalloc_array(brw, enum shader_time_shader_type,
@@ -409,7 +410,7 @@ brw_collect_shader_time(struct brw_context *brw)
    uint32_t *times = brw->shader_time.bo->virtual;

    for (int i = 0; i < brw->shader_time.num_entries; i++) {
-      brw->shader_time.cumulative[i] += times[i];
+      brw->shader_time.cumulative[i] += times[i * SHADER_TIME_STRIDE / 4];
    }

 /* Zero the BO out to clear it out for our next collection.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f319f32..d759710 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1225,7 +1225,7 @@ vec4_visitor::emit_shader_time_write(enum 
shader_time_shader_type type,

 dst_reg offset_mrf = dst_reg(MRF, base_mrf);
 offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, src_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, src_reg(shader_time_index * SHADER_TIME_STRIDE)));

 dst_reg time_mrf = dst_reg(MRF, base_mrf + 1);
 time_mrf.type = BRW_REGISTER_TYPE_UD;
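
The indexing change is easy to check with a small Python model of the buffer: writers place entry `i` at byte offset `i * SHADER_TIME_STRIDE`, while the collector views the same buffer as an array of 32-bit words and therefore reads word `i * SHADER_TIME_STRIDE / 4`. This is a sketch assuming 4-byte unsigned entries and a plain `bytearray` in place of the buffer object. The helper names are invented for the example.

```python
import struct

SHADER_TIME_STRIDE = 64  # bytes between entries, as defined in the patch


def write_entry(buf, index, value):
    """Store a 32-bit value at the byte offset the shader would write to."""
    struct.pack_into('=I', buf, index * SHADER_TIME_STRIDE, value)


def collect(buf, num_entries):
    """Read the entries back through a uint32 view, as the CPU side does."""
    times = memoryview(buf).cast('I')
    return [times[i * SHADER_TIME_STRIDE // 4] for i in range(num_entries)]


buf = bytearray(4 * SHADER_TIME_STRIDE)
for i in range(4):
    write_entry(buf, i, 100 + i)
collected = collect(buf, 4)
```

With the stride at 64 bytes, consecutive entries land 16 words apart, so each one sits in its own cacheline-sized slot.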



[Mesa-dev] [Bug 62319] New: libtxc: missing m4 folder preventing build

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=62319

  Priority: medium
Bug ID: 62319
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: libtxc: missing m4 folder preventing build
  Severity: blocker
Classification: Unclassified
OS: All
  Reporter: alexandre.f.dem...@gmail.com
  Hardware: All
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

When building from libtxc's latest git repository, I receive the following
error:
couldn't open directory 'm4': No such file or directory

Everything was working fine until the switch to autotools. Manually creating the
m4 directory and then running ./autogen.sh works fine. If the m4 directory is
missing, it fails.

Is it possible we are missing an m4 directory in libtxc's git tree?


[Mesa-dev] [Bug 62319] libtxc: missing m4 folder preventing build

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=62319

--- Comment #1 from Alexandre Demers alexandre.f.dem...@gmail.com ---
Also, I think m4 should not be listed in the .gitignore file (I'm inferring
this from mesa's .gitignore file).


[Mesa-dev] [Bug 62319] libtxc: missing m4 folder preventing build

2013-03-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=62319

Alexandre Demers alexandre.f.dem...@gmail.com changed:

       What|Removed                        |Added

     Status|NEW                            |ASSIGNED
   Assignee|mesa-dev@lists.freedesktop.org |alexandre.f.dem...@gmail.com

--- Comment #2 from Alexandre Demers alexandre.f.dem...@gmail.com ---
Created attachment 76508
  --> https://bugs.freedesktop.org/attachment.cgi?id=76508&action=edit
Fixes m4 error

removed m4 from .gitignore
added m4 folder
added m4/.gitignore

