[Mesa-dev] [PATCH] llvmpipe: handle offset_clamp

2013-06-26 Thread sroland
From: Roland Scheidegger srol...@vmware.com This was just ignored (unless for some reason like unfilled polys draw was handling this). I'm not convinced of that code, putting the float for the clamp in the key isn't really a good idea. Then again the other floats for depth bias are already in

[Mesa-dev] [PATCH 1/2] llvmpipe: add support for nested / overlapping queries

2013-06-25 Thread sroland
From: Roland Scheidegger srol...@vmware.com OpenGL doesn't support this but d3d10 does. It is a bit of a pain as it is necessary to keep track of queries still active at the end of a scene, which is also why I cheat a bit and limit the amount of simultaneously active queries to (arbitrary) 16

[Mesa-dev] [PATCH 2/2] softpipe: honor predication for clear_render_target and clear_depth_stencil

2013-06-25 Thread sroland
From: Roland Scheidegger srol...@vmware.com trivial, copied from llvmpipe --- src/gallium/drivers/softpipe/sp_surface.c | 42 +++-- 1 file changed, 40 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/softpipe/sp_surface.c

[Mesa-dev] [PATCH] llvmpipe: rework query logic

2013-06-24 Thread sroland
From: Roland Scheidegger srol...@vmware.com Previously lp_rast_begin_query commands were always inserted into each bin, and re-issued if the scene was restarted, while lp_rast_end_query commands were executed for each still active query at the end of tile rasterization. Also, the ps_invocations

[Mesa-dev] [PATCH] llvmpipe: fix wrong results for queries not in a scene

2013-06-20 Thread sroland
From: Roland Scheidegger srol...@vmware.com The result isn't always 0 in this case (depends on query type), so instead of special casing this just use the ordinary path (should result in correct values thanks to initialization in query_begin/end), just skipping the fence wait. ---

[Mesa-dev] [PATCH 1/2] gallium: fix PIPE_QUERY_TIMESTAMP_DISJOINT

2013-06-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com The semantics didn't really make sense, matching neither d3d9 (though the docs are all broken there) nor d3d10. So make it match d3d10 semantics, which actually gives meaning to the disjoint part. Drivers are fixed up in a very primitive way,

[Mesa-dev] [PATCH 2/2] softpipe: handle all queries, and change for the new disjoint semantics

2013-06-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com The driver can do render_condition but wasn't handling the occlusion and so_overflow predicates (though the latter might not work yet due to gs support). --- src/gallium/drivers/softpipe/sp_query.c | 39 ++- 1 file

[Mesa-dev] [PATCH] llvmpipe: handle more queries

2013-06-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and also fill out the ps_invocations and c_primitives from the PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already be handled). Note that ps_invocations isn't pixel exact,

[Mesa-dev] [PATCH] gallium: add condition parameter to render_condition

2013-06-14 Thread sroland
From: Roland Scheidegger srol...@vmware.com For conditional rendering this makes it possible to skip rendering if either the predicate is true or false, as supported by d3d10 (in fact previously it was sort of implied skip rendering if predicate is false for occlusion predicate, and true for
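
A minimal sketch of what such a condition parameter means for a driver's "should I draw?" check. The helper name and the convention that rendering is skipped when the predicate equals the condition are assumptions made for illustration, not the actual gallium interface:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the gallium interface itself.
 * query_result: raw value read back from the predicate query
 * condition:    rendering is skipped when the predicate equals this
 *               value (assumed convention for this sketch) */
static bool should_render(uint64_t query_result, bool condition)
{
   bool predicate = (query_result != 0);
   return predicate != condition;
}

/* The four combinations, written out as checks. */
static void should_render_examples(void)
{
   assert(should_render(0, true));    /* predicate false, skip-if-true:  draw */
   assert(!should_render(5, true));   /* predicate true,  skip-if-true:  skip */
   assert(!should_render(0, false));  /* predicate false, skip-if-false: skip */
   assert(should_render(5, false));   /* predicate true,  skip-if-false: draw */
}
```

With a single boolean like this, one check covers both the "skip if true" and "skip if false" cases instead of implying the polarity from the query type.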

[Mesa-dev] [PATCH] llvmpipe: fixes for conditional rendering

2013-06-14 Thread sroland
From: Roland Scheidegger srol...@vmware.com honor render_condition for clear_render_target and clear_depth_stencil. Also add minimal support for occlusion predicate, though it can't be active at the same time as an occlusion query yet. While here also switchify some large if-else (actually just

[Mesa-dev] [PATCH] util: new util_fill_box helper

2013-06-08 Thread sroland
From: Roland Scheidegger srol...@vmware.com Use new util_fill_box helper for util_clear_render_target. (Also fix off-by-one map error.) --- src/gallium/auxiliary/util/u_surface.c | 39 +-- src/gallium/auxiliary/util/u_surface.h |7 +

[Mesa-dev] [PATCH] gallium/docs: fix up transfer description for 1d arrays, add cube map arrays

2013-06-06 Thread sroland
From: Roland Scheidegger srol...@vmware.com Transfers always use z/depth for layers no matter if it's a 1d or 2d array texture, we don't follow OpenGL's craziness there. Luckily this appears to only be a doc bug, everyone is doing the right thing already. While here also document z/depth parameter

[Mesa-dev] [PATCH 1/2] llvmpipe: move create_surface/destroy_surface functions to lp_surface.c

2013-06-06 Thread sroland
From: Roland Scheidegger srol...@vmware.com Believe it or not but these two are actually the first two functions which really belong in this file nowadays. --- src/gallium/drivers/llvmpipe/lp_surface.c | 59 - src/gallium/drivers/llvmpipe/lp_texture.c | 59

[Mesa-dev] [PATCH 2/2] util: fix util_clear_render_target and util_clear_depth_stencil layer handling

2013-06-06 Thread sroland
From: Roland Scheidegger srol...@vmware.com These functions must clear all bound layers, not just the first. --- src/gallium/auxiliary/util/u_surface.c | 190 +-- src/gallium/auxiliary/util/u_transfer.c |1 + 2 files changed, 104 insertions(+), 87 deletions(-)

[Mesa-dev] [PATCH 1/3] gallium/tgsi: add missing string for layer semantic

2013-06-05 Thread sroland
From: Roland Scheidegger srol...@vmware.com Also report if a shader writes the layer semantic --- src/gallium/auxiliary/draw/draw_context.c |2 +- src/gallium/auxiliary/tgsi/tgsi_scan.c|5 + src/gallium/auxiliary/tgsi/tgsi_scan.h|1 +

[Mesa-dev] [PATCH 2/3] llvmpipe: add support for layered rendering

2013-06-05 Thread sroland
From: Roland Scheidegger srol...@vmware.com Mostly just make sure the layer parameter gets passed through to the right places (and get clamped, can do this at setup time), fix up clears to clear all layers and disable opaque optimization. Luckily don't need to touch the jitted code. (Clears

[Mesa-dev] [PATCH 3/3] llvmpipe: bump 3d and cube map limits to 2048 and 8192 respectively

2013-06-05 Thread sroland
From: Roland Scheidegger srol...@vmware.com These should just work (?), required by d3d10. Too large resources will get thrown out separately anyway. --- src/gallium/drivers/llvmpipe/lp_limits.h |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git

[Mesa-dev] [PATCH] gallivm: work around slow code generated for interleaving 128bit vectors

2013-06-04 Thread sroland
From: Roland Scheidegger srol...@vmware.com We use 128bit vector interleave for untwiddling in the blend code (with 256bit vectors). llvm generates terrible code for this for some reason, so instead of generating a shuffle for 2 128bit vectors use an extract/insert shuffle instead (it only seems

[Mesa-dev] [PATCH 1/4] gallivm: (trivial) fix lp_build_concat_n

2013-06-03 Thread sroland
From: Roland Scheidegger srol...@vmware.com The code was designed to handle no-op concat but failed (unless the caller was using same pointer for src and dst). --- src/gallium/auxiliary/gallivm/lp_bld_pack.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH 2/4] gallivm: enhance special sse2 4x4f and 2x8f -> 1x16ub conversion

2013-06-03 Thread sroland
From: Roland Scheidegger srol...@vmware.com There's no good reason why it can't handle 2x4f->1x8ub, 1x4f->1x4ub and 1x8f->1x8ub cases, there might be legitimate reasons why we don't have enough input vectors for a full destination vector, and using pack intrinsics should still be much better than

[Mesa-dev] [PATCH 3/4] llvmpipe: cleanup of generate_unswizzled_blend

2013-06-03 Thread sroland
From: Roland Scheidegger srol...@vmware.com Some parameters were used inconsistently, for instance not using block_width/block_height/block_size for deferring number of pixels but rather relying on guesses from the number of fragment shaders etc, so fix this up (no actual change in behavior since

[Mesa-dev] [PATCH 4/4] llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1

2013-06-03 Thread sroland
From: Roland Scheidegger srol...@vmware.com For rendering to buffers, we cannot have any y alignment. So make sure that tile clear commands only clear up to the fb width/height, not more (do this for all resources actually as clearing more seems pointless for other resources too). For the jit fs

[Mesa-dev] [PATCH] llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1

2013-05-31 Thread sroland
From: Roland Scheidegger srol...@vmware.com For rendering to buffers, we cannot have any y alignment. So make sure that tile clear commands only clear up to the fb width/height, not more (do this for all resources actually as clearing more seems pointless for other resources too). For the jit fs

[Mesa-dev] [PATCH] llvmpipe: fix bogus assertions for buffer surfaces

2013-05-31 Thread sroland
From: Roland Scheidegger srol...@vmware.com One of the assertions made no sense for buffer rendertargets (due to the union), so drop it. (The same assertion is present already in the path for texture surfaces later.) --- src/gallium/drivers/llvmpipe/lp_texture.c |4 ++-- 1 file changed, 2

[Mesa-dev] [PATCH] gallium: add support for layered rendering

2013-05-31 Thread sroland
From: Roland Scheidegger srol...@vmware.com Since pipe_surface already has all the necessary fields no interface changes are necessary except adding a new shader semantic value (TGSI_SEMANTIC_LAYER), though add a pipe capability bit for it as well. (Note that what GL knows as gl_Layer variable

[Mesa-dev] [PATCH] gallivm: fix out-of-bounds access with mirror_clamp_to_edge address mode

2013-05-31 Thread sroland
From: Roland Scheidegger srol...@vmware.com Surprising that this bug survived so long; we were missing a clamp (in the linear filtering version). (Valgrind complained a lot about invalid reads with piglit texwrap, I've also seen spurious failures in this test which might have happened due to this.

[Mesa-dev] [PATCH] llvmpipe: get rid of tiled/linear layout remains

2013-05-28 Thread sroland
From: Roland Scheidegger srol...@vmware.com Eliminate the rest of the no longer needed layout logic. (It is possible some code could be simplified a bit further still.) --- src/gallium/drivers/llvmpipe/lp_scene.c |6 +- src/gallium/drivers/llvmpipe/lp_setup.c |6 +-

[Mesa-dev] [PATCH] llvmpipe: reduce alignment requirement for resources from 64x64 to 4x4

2013-05-28 Thread sroland
From: Roland Scheidegger srol...@vmware.com The overallocation was very bad especially for things like 1d array textures which got blown up by a factor of 64. (Even ordinary smallish 2d textures benefit a lot from this, a mipmapped 64x64 rgba8 texture previously used 7*16kB = 112kB instead of now
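
The 7*16kB = 112kB figure can be re-derived with a rough sketch. It assumes a square texture, tightly packed texels and every mip level padded independently to the alignment in both dimensions; the function names are made up for illustration and this is not llvmpipe's allocation code. It makes no claim about the number that is cut off at the end of the message:

```c
#include <assert.h>
#include <stddef.h>

/* Round v up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t v, size_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Bytes for a full mip chain of a square size x size texture with
 * bytes_per_pixel bytes per texel, each level padded to align x align. */
static size_t mip_chain_bytes(size_t size, size_t bytes_per_pixel, size_t align)
{
   size_t total = 0;

   assert(size >= 1);
   for (;;) {
      size_t padded = align_up(size, align);
      total += padded * padded * bytes_per_pixel;
      if (size == 1)
         break;
      size /= 2;   /* next (smaller) mip level */
   }
   return total;
}
```

Calling mip_chain_bytes(64, 4, 64) walks the seven levels 64, 32, ..., 1, each padded to 64x64 at 4 bytes per texel, which is where the 7*16kB comes from; passing 4 as the last argument gives the corresponding estimate for the reduced alignment under the same assumptions.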

[Mesa-dev] [PATCH 1/5] llvmpipe: fix bug in early depth test / late depth write handling

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com Using wrong type if the format was less than 32bits. No piglit changes as it doesn't hit that path. --- src/gallium/drivers/llvmpipe/lp_bld_depth.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git

[Mesa-dev] [PATCH 2/5] llvmpipe: (trivial) remove confusing code in stencil test

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com This was meant to disable some code which isn't needed when depth/stencil isn't written. However, there's more code which wouldn't be needed in that case so having the condition there was just odd (llvm will drop all the code anyway). ---

[Mesa-dev] [PATCH 3/5] llvmpipe: fix issue with not writing new stencil values

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com We did mask checks between depth/stencil testing and depth/stencil write. This meant that if the depth/stencil test killed off all fragments we never actually wrote the new stencil value. This issue affected all early/late test/write combinations. So
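
A toy model of the ordering problem described here, for one quad of four fragments. The fixed state (depth func LESS, stencil func ALWAYS, zfail op INCR, zpass op KEEP) and all names are assumptions for the sketch; the real code is jitted per-format and looks nothing like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUAD_SIZE 4

/* Hypothetical fixed state: depth func LESS, stencil func ALWAYS,
 * zfail op INCR (saturating), zpass op KEEP. */
static void depth_stencil_quad(const float frag_z[QUAD_SIZE],
                               bool mask[QUAD_SIZE],
                               float zbuf[QUAD_SIZE],
                               uint8_t sbuf[QUAD_SIZE],
                               bool check_mask_before_write)
{
   bool zpass[QUAD_SIZE];
   bool any_alive = false;
   int i;

   assert(frag_z && mask && zbuf && sbuf);

   /* Test phase: which covered fragments survive the depth test? */
   for (i = 0; i < QUAD_SIZE; i++) {
      zpass[i] = mask[i] && frag_z[i] < zbuf[i];
      any_alive = any_alive || zpass[i];
   }

   /* The problematic ordering: bail out when nothing survived, before
    * the stencil update for the failing fragments has been written. */
   if (check_mask_before_write && !any_alive) {
      for (i = 0; i < QUAD_SIZE; i++)
         mask[i] = false;
      return;
   }

   /* Write phase: covered fragments update depth on pass, stencil on fail. */
   for (i = 0; i < QUAD_SIZE; i++) {
      if (mask[i]) {
         if (zpass[i])
            zbuf[i] = frag_z[i];
         else if (sbuf[i] < UINT8_MAX)
            sbuf[i]++;
      }
      mask[i] = zpass[i];
   }
}
```

If all four fragments fail the depth test, the call with check_mask_before_write set leaves the stencil buffer untouched, while the call without it increments the stencil value for every covered fragment, which is the behavior the zfail op asks for.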

[Mesa-dev] [PATCH 4/5] llvmpipe: fix early depth test / late depth write stencil issues

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com We actually did early depth/stencil test and late depth/stencil write even when the shader could kill the fragment (alpha test or discard). Since it matters for the new stencil value if the fragment is killed by depth/stencil test or by the shader (in

[Mesa-dev] [PATCH 5/5] llvmpipe: disable simple_shader optimization

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com This optimization disabled mask checks if the shader is simple enough. While this should work correctly, the problem is that it can hide real issues because shaders in practice are usually complex enough (8 instructions or 1 texture is already enough)

[Mesa-dev] [PATCH 1/2] softpipe: disambiguate TILE_SIZE / TEX_TILE_SIZE

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com These can be different (just like NUM_TEX_TILE_ENTRIES / NUM_ENTRIES), though currently they aren't. --- src/gallium/drivers/softpipe/sp_tex_sample.c | 28 +++--- src/gallium/drivers/softpipe/sp_tex_tile_cache.c | 28

[Mesa-dev] [PATCH 2/2] softpipe: change TEX_TILE_SIZE and NUM_TEX_TILE_ENTRIES

2013-05-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com Initially we had NUM_TEX_TILE_ENTRIES of 50, however this was using too much memory (mostly because the tile cache is operating on fixed max current sampler views which could be fixed but that's another topic). So it was decreased to 4. However this is

[Mesa-dev] [PATCH] llvmpipe: fix stencil issues

2013-05-17 Thread sroland
From: Roland Scheidegger srol...@vmware.com Two (somewhat related) issues: 1) We did mask checks between depth/stencil testing and depth/stencil write. This meant that if the depth/stencil test killed off all fragments we never actually wrote the new stencil value. This issue affected all

[Mesa-dev] [PATCH 1/3] gallivm: handle z32s8x24 format for sampling

2013-05-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com Since we can only sample either depth or stencil but not both only load the required bits which makes things a bit easier (it requires special handling since the format doesn't fit into 32bit). The logic for deciding if depth or stencil should be

[Mesa-dev] [PATCH 2/3] llvmpipe: handle z32s8x24 depth/stencil format

2013-05-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com We need to split up the depth and stencil values in this case, and there's some new logic required to handle float depth and stencil simultaneously. Also make sure we get the 64bit zs clear values and masks propagated correctly. ---

[Mesa-dev] [PATCH 3/3] llvmpipe: enable z32s8x24 format

2013-05-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com Now that we can handle it both for sampling and as depth/stencil enable it. Passes nearly all additional piglit tests which are now performed, with two exceptions (one being a framebuffer blit which fails for all other formats including stencil too as

[Mesa-dev] [PATCH] llvmpipe: get rid of unused tiled/linear logic

2013-05-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com We do rendering to linear color buffers for quite some time, and since switching to linear depth buffers all the tiled/linear logic was unused. So get rid of (most) of it - there's still some LAYOUT_NONE things and late allocation of resources which

[Mesa-dev] [PATCH] llvmpipe: fix bogus handling of first_layer when setting up texture sampling

2013-05-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com The code avoided first_layer parameter in the sampler interface (and needing to do another calculation at runtime) by fixing up the base texture pointer instead. Unfortunately, this didn't actually work as we have mip-first texture layout so fixing up

[Mesa-dev] [PATCH] st/mesa: fix weird UCMP opcode use for bool ubo load

2013-05-08 Thread sroland
From: Roland Scheidegger srol...@vmware.com I don't know what this code was trying to do but whatever it was it couldn't have worked since negation of integer boolean inputs while not specified as outright illegal (not yet at least) won't do anything since it doesn't affect the result of

[Mesa-dev] [PATCH] gallium/tgsi: clarify (possibly change) TGSI_OPCODE_UCMP definition

2013-05-07 Thread sroland
From: Roland Scheidegger srol...@vmware.com UCMP while an integer opcode isn't really consistently implemented as having all integer arguments. softpipe will assume all arguments are ints, whereas gallivm has the arguments defined as untyped which means they'll get treated as floats. This means

[Mesa-dev] [PATCH] gallium: more tgsi documentation updates

2013-05-03 Thread sroland
From: Roland Scheidegger srol...@vmware.com Adds the remaining integer opcodes, and some opcodes are moved to more appropriate places, along with getting rid of the (already nearly empty) ps_2_x section. Though the CAP bits for some of these are still a bit in the air so the documentation isn't

[Mesa-dev] [PATCH] gallium: tgsi documentation updates and clarification for integer opcodes.

2013-05-02 Thread sroland
From: Roland Scheidegger srol...@vmware.com A lot of them were missing. Others were moved from the Compute ISA to a new Integer ISA section as that seemed more appropriate. --- src/gallium/docs/source/tgsi.rst | 362 ++ 1 file changed, 289 insertions(+), 73

[Mesa-dev] [PATCH] llvmpipe: get rid of depth swizzling.

2013-04-26 Thread sroland
From: Roland Scheidegger srol...@vmware.com Eliminating this we no longer need to copy between linear and swizzled layout. This is probably not quite ideal since it's a bit more work for now, could do some optimizations by moving depth testing outside the fragment shader loop (but tricky for

[Mesa-dev] [PATCH 1/5] gallivm: increase nesting limit to 66

2013-04-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com This is still not really correct, since at least for sm 4.0 the nesting limit is 64 per subroutine, and subroutine nesting itself has a limit of 32, so since we have a flat stack we'd need 32*64. But this should probably be better fixed with

[Mesa-dev] [PATCH 2/5] gallium: document breakc and switch/case/default/endswitch

2013-04-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com Docs were missing; the opcode-from-hell switch in particular is anything but obvious. --- src/gallium/docs/source/tgsi.rst | 57 ++ 1 file changed, 51 insertions(+), 6 deletions(-) diff --git

[Mesa-dev] [PATCH 3/5] gallivm/tgsi: fix up breakc

2013-04-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com It seems there was a typo in gallivm breakc handling (I am actually still not sure it is really needed but otherwise that statement really should go away). Also fix the wrong src argument type, even though they weren't really used. ---

[Mesa-dev] [PATCH 4/5] gallivm: use uint build context for mask instead of float

2013-04-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com Unsurprisingly no one was using it except for grabbing builder. --- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c

[Mesa-dev] [PATCH 5/5] gallivm: implement switch opcode

2013-04-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com Should be able to handle all things which make this tricky to implement. Fallthroughs, including most notably into/out of default, should be handled correctly but are quite a mess. If we see largely unoptimized switches in the wild should probably think

[Mesa-dev] [PATCH 1/2] gallivm: Add no_rho_opt debug option

2013-04-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com This will calculate rho correctly as sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2)) instead of max(|ds/dx|,|dt/dx|,|dr/dx|,|ds/dy|,|dt/dy|,|dr/dy|) (for 3 coords - 2 coords work analogously, for 1 coord there's no point
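
The two variants can be compared with a scalar sketch. It assumes the second sum in the accurate form is meant to be taken over the y derivatives, and it returns rho squared rather than rho (take the square root of the result to get rho) so that it needs nothing from the math library. The function names are hypothetical, not gallivm's:

```c
#include <assert.h>

static float maxf(float a, float b) { return a > b ? a : b; }
static float absf(float a) { return a < 0.0f ? -a : a; }

/* Accurate form, squared: max of the squared lengths of the x and y
 * derivative vectors. ddx = {ds/dx, dt/dx, dr/dx}, ddy likewise. */
static float rho_squared_accurate(const float ddx[3], const float ddy[3])
{
   assert(ddx && ddy);
   float len2_x = ddx[0] * ddx[0] + ddx[1] * ddx[1] + ddx[2] * ddx[2];
   float len2_y = ddy[0] * ddy[0] + ddy[1] * ddy[1] + ddy[2] * ddy[2];
   return maxf(len2_x, len2_y);
}

/* Cheap form: largest absolute value of any single derivative. */
static float rho_approx(const float ddx[3], const float ddy[3])
{
   float m = 0.0f;
   int i;

   assert(ddx && ddy);
   for (i = 0; i < 3; i++)
      m = maxf(m, maxf(absf(ddx[i]), absf(ddy[i])));
   return m;
}
```

For derivatives that are not axis-aligned the cheap form underestimates: with ds/dx = 3 and dt/dx = 4 the accurate rho squared is 25 (rho = 5), while the cheap form yields 4.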

[Mesa-dev] [PATCH 2/2] gallivm: change cubemaps / derivatives handling, take 55

2013-04-16 Thread sroland
From: Roland Scheidegger srol...@vmware.com Turns out the previous fix for handling per-pixel face selection and derivatives didn't work out that well - the derivatives were wrong by quite a bit, in theory transformation of the derivatives into cube space should work, but would be _a lot_ more

[Mesa-dev] [PATCH] gallivm: fix small but severe bug in handling multiple lod level strides

2013-04-14 Thread sroland
From: Roland Scheidegger srol...@vmware.com Inserting the value for the second quad in the wrong place for the following shuffle. This meant the row or image stride was undefined which is quite catastrophic, can lead to bogus texels fetched or just segfault. This code is only hit for SoA path

[Mesa-dev] [PATCH] gallivm: some minor cube map cleanup

2013-04-04 Thread sroland
From: Roland Scheidegger srol...@vmware.com The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name. Also, while here, change the chosen major

[Mesa-dev] [PATCH] gallivm: use f16c hw support for float-half and half-float conversion

2013-04-02 Thread sroland
From: Roland Scheidegger srol...@vmware.com Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge. ---

[Mesa-dev] [PATCH 1/3] gallivm: minor rho calculation optimization for 1 or 3 coords

2013-04-02 Thread sroland
From: Roland Scheidegger srol...@vmware.com Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes. --- src/gallium/auxiliary/gallivm/lp_bld_quad.c | 20 +++- src/gallium/auxiliary/gallivm/lp_bld_sample.c | 31

[Mesa-dev] [PATCH 2/3] gallivm: do per-pixel cube face selection (finally!!!)

2013-04-02 Thread sroland
From: Roland Scheidegger srol...@vmware.com This proved to be tricky, the problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face the derivatives could get randomly exceedingly large). However, it is actually quite

[Mesa-dev] [PATCH 3/3] gallivm: honor explicit derivatives values for cube maps.

2013-04-02 Thread sroland
From: Roland Scheidegger srol...@vmware.com This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Untested (no piglit test) however since the transform works the same as implicit derivatives this should probably work

[Mesa-dev] [PATCH 1/2] gallivm: consolidate code for float-to-half and float-to-packed conversion.

2013-03-29 Thread sroland
From: Roland Scheidegger srol...@vmware.com This replaces the existing float-to-half implementation. There are definitely a couple of differences - the old implementation had unspecified(?) rounding behavior, and could at least in theory construct Inf values out of NaNs. NaNs and Infs should now

[Mesa-dev] [PATCH 2/2] gallivm: bring back optimized but incorrect float to smallfloat optimizations

2013-03-29 Thread sroland
From: Roland Scheidegger srol...@vmware.com Conceptually the same as previously done in float_to_half. Should cut down number of instructions from 14 to 10 or so, but will promote some NaNs to Infs, so it's disabled. It gets a bit tricky though handling all the cases correctly... Passes basic

[Mesa-dev] [PATCH] gallivm: consolidate some half-to-float and r11g11b10-to-float code

2013-03-24 Thread sroland
From: Roland Scheidegger srol...@vmware.com Similar enough that we can try to use shared code. As far as I can tell this should also fix an issue with negative values for half-to-float conversion (not noticed in tests). --- src/gallium/auxiliary/gallivm/lp_bld_conv.c| 46

[Mesa-dev] [PATCH] gallivm: consolidate code for float-to-half and float-to-packed conversion.

2013-03-24 Thread sroland
From: Roland Scheidegger srol...@vmware.com This replaces the existing float-to-half implementation. There are definitely a couple of differences - the old implementation had unspecified(?) rounding behavior, and could at least in theory construct Inf values out of NaNs. NaNs and Infs should now

[Mesa-dev] [PATCH] gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file

2013-03-23 Thread sroland
From: Roland Scheidegger srol...@vmware.com This is really not generic conversion stuff and the code very particular to these formats. --- src/gallium/auxiliary/Makefile.sources |1 + src/gallium/auxiliary/gallivm/lp_bld_conv.c| 329 -

[Mesa-dev] [PATCH] gallivm: Add code for rgb9e5 shared exponent format to float conversion

2013-03-22 Thread sroland
From: Roland Scheidegger srol...@vmware.com And use this (and the code for r11g11b10 packed float to float conversion) in the soa texturing code (the generated code looks quite good). Should be an order of magnitude faster probably than using the fallback (not measured). Tested with piglit

[Mesa-dev] [PATCH] llvmpipe: add EXT_packed_float render target format support

2013-03-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com New conversion code to handle conversion from/to r11g11b10 AoS to/from SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA (which works pretty much the same as r11g11b10 except for the packing). (This code should also be used for

[Mesa-dev] [PATCH] gallivm: fix returning unconditionally from main on TGSI_OPCODE_RET

2013-03-15 Thread sroland
From: Roland Scheidegger srol...@vmware.com If we're in some conditional we must not return, or the code after the condition is never executed. (Probably the same for loops.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357. Note: This is a candidate for the stable branches. ---

[Mesa-dev] [PATCH] gallivm: fix return opcode handling in main function of a shader

2013-03-15 Thread sroland
From: Roland Scheidegger srol...@vmware.com If we're in some conditional or loop we must not return, or the code after the condition is never executed. (v2): And, we also can't just continue as nothing happened, since the mask update code would later check if we actually have a mask, so we need
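
The underlying idea can be sketched with per-lane bitmasks: a RET inside a conditional retires only the lanes currently executing, and main may only really end once no lane is left. The struct and function names are invented for this sketch and do not mirror the gallivm mask handling code:

```c
#include <assert.h>
#include <stdint.h>

/* One bit per SIMD lane. Names are hypothetical. */
struct lane_masks {
   uint8_t cond_mask;  /* lanes enabled by the enclosing IF/loop */
   uint8_t ret_mask;   /* lanes that have not executed RET yet */
};

/* Lanes that actually execute the current instruction. */
static uint8_t exec_mask(const struct lane_masks *m)
{
   assert(m);
   return (uint8_t)(m->cond_mask & m->ret_mask);
}

/* RET inside control flow: retire only the currently executing lanes.
 * Returns nonzero once every lane has returned, i.e. when main can
 * really be left. */
static int op_ret(struct lane_masks *m)
{
   assert(m);
   m->ret_mask &= (uint8_t)~exec_mask(m);
   return m->ret_mask == 0;
}
```

With four lanes of which two take the IF branch containing the RET, op_ret reports that main cannot end yet, and after the ENDIF the remaining two lanes still execute the code that follows.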

[Mesa-dev] [PATCH 1/3] softpipe: don't assert when creating surfaces with multiple layers

2013-03-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com We can't handle them yet, however we can safely just warn (we will just render to first layer, which is fine since we can't handle rendertarget system value either). Also make behavior more predictable with buffer surfaces (it would sometimes hit bogus

[Mesa-dev] [PATCH 2/3] llvmpipe: don't assert when trying to render to surfaces with multiple layers

2013-03-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com instead just warn when creating the surface, rendering will simply happen to first layer. --- src/gallium/drivers/llvmpipe/lp_scene.c |2 -- src/gallium/drivers/llvmpipe/lp_texture.c |3 +++ 2 files changed, 3 insertions(+), 2 deletions(-)

[Mesa-dev] [PATCH 3/3] tgsi: fix sample_d emit for arrays

2013-03-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com Those cases were apparently forgotten. --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 30 +++--- 1 file changed, 11 insertions(+), 19 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c

[Mesa-dev] [PATCH 2/3] tgsi: emit code for SVIEWINFO and SAMPLE_I

2013-03-08 Thread sroland
From: Roland Scheidegger srol...@vmware.com Can handle them since the single sampler interface was introduced. --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c

[Mesa-dev] [PATCH] gallivm: clean up passing derivatives around

2013-03-08 Thread sroland
From: Roland Scheidegger srol...@vmware.com Previously, the derivatives were calculated and passed in a packed form to the sample code (for implicit derivatives, explicit derivatives were packed to the same format). There's several reasons why this wasn't such a good idea: 1) the derivatives may

[Mesa-dev] [PATCH] tgsi: handle projection modifier for array textures.

2013-03-05 Thread sroland
From: Roland Scheidegger srol...@vmware.com This partly reverts 6ace2e41da7dded630d932d03bacb7e14a93d47a. Apparently with GL_MESA_texture_array fixed-function texturing with texture arrays is possible, and hence we have to handle TXP. (Though no one seems to know the semantics, softpipe now does

[Mesa-dev] [PATCH] draw/llvm: skip clipping and viewport transform if there's no position output

2013-03-01 Thread sroland
From: Roland Scheidegger srol...@vmware.com With glsl 1.40 writing position is not required (useful for transform feedback, though in fact it's still possible to rasterize such geometry even if the results aren't too well defined). Prevents crashes in that case. Fixes piglit

[Mesa-dev] [PATCH] llvmpipe: don't assert on illegal surface creation.

2013-03-01 Thread sroland
From: Roland Scheidegger srol...@vmware.com Since c8eb2d0e829d0d2aea6a982620da0d3cfb5982e2 llvmpipe checks if it's actually legal to create a surface. The opengl state tracker doesn't quite obey this so for now just warn instead of assert. Also warn instead of disabled assert when creating

[Mesa-dev] [PATCH] tgsi: add texel offsets and derivatives to sampler interface

2013-03-01 Thread sroland
From: Roland Scheidegger srol...@vmware.com Something I never got around to implementing, but this is the tgsi execution side for implementing texel offsets (for ordinary texturing) and explicit derivatives for sampling (though I guess the ordering of the components for the derivs parameters is

[Mesa-dev] [PATCH 1/2] draw: fix no position output in non-llvm pipeline.

2013-03-01 Thread sroland
From: Roland Scheidegger srol...@vmware.com It seems easiest (and best) if we simply skip all the later stages (after stream output). (This is different to the llvm case at least for now where we will simply try to render garbage, though both behaviors should be correct.) Fixes piglit

[Mesa-dev] [PATCH 1/2] gallivm: add support for texel offsets for ordinary texturing.

2013-02-28 Thread sroland
From: Roland Scheidegger srol...@vmware.com This was previously only handled for texelFetch (much easier). Depending on the wrap mode this works slightly differently (for somewhat efficient implementation), hence have to do that separately in all roughly 137 places - it is easy if we use fixed

[Mesa-dev] [PATCH 2/2] llvmpipe: bump glsl version to 130

2013-02-28 Thread sroland
From: Roland Scheidegger srol...@vmware.com texel offsets should have been the last missing feature (not sure if anything is actually missing for 140). In any case we still don't do OpenGL 3.0 (missing MSAA which will be difficult, plus EXT_packed_float, ARB_depth_buffer_float and

[Mesa-dev] [PATCH] llvmpipe: bump glsl version to 140

2013-02-28 Thread sroland
From: Roland Scheidegger srol...@vmware.com texel offsets should have been the last missing feature (not sure if anything is actually missing for 140). In any case we still don't do OpenGL 3.0 (missing MSAA which will be difficult, plus EXT_packed_float, ARB_depth_buffer_float and

[Mesa-dev] [PATCH] llvmpipe: check buffers in llvmpipe_is_resource_referenced

2013-02-27 Thread sroland
From: Roland Scheidegger srol...@vmware.com Now that buffers can be used as textures or render targets make sure they aren't skipped. Fix suggested by Jose Fonseca. --- src/gallium/drivers/llvmpipe/lp_surface.c | 14 +++--- src/gallium/drivers/llvmpipe/lp_texture.c |4 +++- 2

[Mesa-dev] [PATCH] llvmpipe: support rendering to buffer render targets.

2013-02-26 Thread sroland
From: Roland Scheidegger srol...@vmware.com Unfortunately not usable from OpenGL, and no cap bit. Pretty similar to a 1d texture, though allows specifying a start element. The util code for handling clears also needs adjustments (and fix a bug causing crashes for handling pure integer formats
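
The "1d texture with a start element" idea can be sketched as plain address arithmetic. This is only an illustration under assumptions: the function name and the explicit last_element bound are hypothetical, and the real driver code is not shown in the snippet.

```c
#include <assert.h>
#include <stddef.h>

/* Return the address of element x of a buffer render target whose
 * view starts at first_element. All names here are hypothetical. */
static unsigned char *
buffer_rt_element(unsigned char *base, size_t bytes_per_element,
                  size_t first_element, size_t last_element, size_t x)
{
   /* Stay inside the range the surface was created for. */
   assert(first_element + x <= last_element);
   return base + (first_element + x) * bytes_per_element;
}
```

Apart from the start element this is the same indexing a single row of a 1d texture would use.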

[Mesa-dev] [PATCH] llvmpipe: support GL_ARB_texture_buffer_object/GL_ARB_texture_buffer_range

2013-02-22 Thread sroland
From: Roland Scheidegger srol...@vmware.com This also fixes not honoring first/last_layer view parameters for array textures, plus not honoring last_level view parameter for all textures (neither is really used by OpenGL). This mostly passes piglit arb_texture_buffer_object tests (it needs,

[Mesa-dev] [PATCH] draw: make sure pipeline is revalidated when sampler views or samplers change.

2013-02-22 Thread sroland
From: Roland Scheidegger srol...@vmware.com Since with llvm execution parts of sampler view and sampler state are baked into the shader, we need to revalidate, otherwise the wrong shader might get used. (Not completely sure but I think this would not be required for non-llvm case, along with

[Mesa-dev] [PATCH 1/2] gallium/docs: improve text about resources a bit.

2013-02-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com This clarifies some things and gets rid of some old stuff. The most significant one is probably that buffers cannot have formats (nearly all drivers completely ignored format and used width0 as byte size already in any case). There seems to be no use

[Mesa-dev] [PATCH 2/2] llvmpipe: simplify buffer allocation logic.

2013-02-21 Thread sroland
From: Roland Scheidegger srol...@vmware.com Now with buffer formats clarification don't need all that logic any longer. (Note that it never would have worked in any case: because blockwidth and blockheight were swapped, any allocation with a multi-byte format would have had zero size.) ---

[Mesa-dev] [PATCH] draw: make sure key size is calculated consistently.

2013-02-19 Thread sroland
From: Roland Scheidegger srol...@vmware.com Some parts calculated key size by using shader information, others by using the pipe_vertex_element information. Since it is perfectly valid to have more vertex_elements set than the vertex shader is using, those may not be the same, so we weren't
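
To illustrate the mismatch, here is a minimal sketch assuming a variable-sized key made of a fixed header plus one entry per element. The struct layout, names and the limit of 32 are hypothetical, not the draw module's real key; which count is the right one is not stated in the snippet, so the sketch simply takes it as a parameter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEY_ELEMENTS 32   /* hypothetical limit, for illustration */

/* Hypothetical, simplified variant key. */
struct element_key { unsigned format, offset; };
struct variant_key {
   unsigned nr_elements;
   struct element_key element[1];   /* really nr_elements entries */
};

/* Single place that turns an element count into a key size. Every
 * caller must pass the same count for the same key, otherwise
 * comparing or hashing keys covers the wrong number of bytes. */
static size_t
variant_key_size(unsigned nr_elements)
{
   assert(nr_elements <= MAX_KEY_ELEMENTS);
   return offsetof(struct variant_key, element) +
          nr_elements * sizeof(struct element_key);
}
```

If one call site passed the number of shader inputs (say 2) and another the number of bound vertex elements (say 4), the two sizes would differ by two entries, which is the inconsistency described above.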

[Mesa-dev] [PATCH] llvmpipe: lp_resource_copy cleanup

2013-02-19 Thread sroland
From: Roland Scheidegger srol...@vmware.com We don't need to flush resources for each layer, and since we don't actually care about layer at all in the flush function just drop the parameter. Also we can use util_copy_box instead of repeated util_copy_rect. ---
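
The relationship between a rect copy and a box copy can be sketched as follows. These are hypothetical byte-based helpers written for illustration; they do not reproduce the signatures of util_copy_rect or util_copy_box, which also deal with formats and x/y/z origins.

```c
#include <assert.h>
#include <string.h>

/* Copy a width_bytes x height rectangle between two pitched slices. */
static void
copy_rect_bytes(unsigned char *dst, size_t dst_stride,
                const unsigned char *src, size_t src_stride,
                size_t width_bytes, unsigned height)
{
   assert(width_bytes <= dst_stride && width_bytes <= src_stride);
   for (unsigned y = 0; y < height; y++)
      memcpy(dst + y * dst_stride, src + y * src_stride, width_bytes);
}

/* Copy a width_bytes x height x depth box: one rect copy per slice,
 * with the per-layer loop living inside the helper, not the caller. */
static void
copy_box_bytes(unsigned char *dst, size_t dst_stride, size_t dst_slice_stride,
               const unsigned char *src, size_t src_stride, size_t src_slice_stride,
               size_t width_bytes, unsigned height, unsigned depth)
{
   for (unsigned z = 0; z < depth; z++)
      copy_rect_bytes(dst + z * dst_slice_stride, dst_stride,
                      src + z * src_slice_stride, src_stride,
                      width_bytes, height);
}
```

A caller that used to loop over layers and call the rect helper each time can make a single box call instead.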

[Mesa-dev] [PATCH] gallivm: fix indirect src register fetches requiring bitcast

2013-02-19 Thread sroland
From: Roland Scheidegger srol...@vmware.com For constant and temporary register fetches, the bitcasts weren't done correctly for the indirect case, leading to crashes due to type mismatches. Simply do the bitcasts after fetching (much simpler than fixing up the load pointer for the various
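
A plain C analogy of "fetch first, bitcast afterwards", assuming a register file stored as 32-bit IEEE 754 floats. The function names are hypothetical; the actual fix operates on LLVM IR values, not C arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer. */
static uint32_t
bitcast_float_to_u32(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Indirect fetch: the register file keeps its float type for the
 * load, and the reinterpretation happens on the loaded value. */
static uint32_t
fetch_indirect_as_u32(const float *regs, unsigned num_regs,
                      unsigned base, unsigned addr)
{
   unsigned index = base + addr;

   assert(index < num_regs);
   return bitcast_float_to_u32(regs[index]);
}
```

Casting the loaded value leaves the address computation untouched, which is why it is simpler than adjusting the load pointer type for each register file.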

[Mesa-dev] [PATCH] draw: make sure key size is calculated consistently.

2013-02-18 Thread sroland
From: Roland Scheidegger srol...@vmware.com Some parts calculated key size by using shader information, others by using the pipe_vertex_element information. Since it is perfectly valid to have more vertex_elements set than the vertex shader is using, those may not be the same, so we weren't

[Mesa-dev] [PATCH] gallivm/tgsi: fix src modifier fetching with non-float types.

2013-02-15 Thread sroland
From: Roland Scheidegger srol...@vmware.com Need to take the type into account. Also, if we want to allow mov's with modifiers we need to pick a type (assume float). v2: don't allow all modifiers on all types, in particular don't allow absolute on non-float types and don't allow negate on

[Mesa-dev] [PATCH 1/3] gallivm: DIV shouldn't be deprecated.

2013-02-14 Thread sroland
From: Roland Scheidegger srol...@vmware.com (Though it looks like GLSL won't emit it.) --- src/gallium/auxiliary/gallivm/lp_bld_tgsi.c |1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c index

[Mesa-dev] [PATCH 2/3] gallivm: fix src modifier fetching with non-float types.

2013-02-14 Thread sroland
From: Roland Scheidegger srol...@vmware.com Need to take the type into account. Also, if we want to allow mov's with modifiers we need to pick a type (assume float). --- src/gallium/auxiliary/gallivm/lp_bld_tgsi.c | 54 ++- 1 file changed, 52 insertions(+), 2
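
Why the type matters for a source modifier can be seen on raw 32-bit register contents. The sketch below is an illustration only, assuming IEEE 754 floats and two's complement integers; the enum and function are hypothetical and not taken from lp_bld_tgsi.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register types for illustration. */
enum reg_type { TYPE_FLOAT, TYPE_SIGNED };

/* Apply a negate modifier to the raw bits of a register value. */
static uint32_t
apply_negate(uint32_t bits, enum reg_type type)
{
   switch (type) {
   case TYPE_FLOAT:
      /* Float negate flips the sign bit only. */
      return bits ^ UINT32_C(0x80000000);
   case TYPE_SIGNED:
      /* Integer negate is two's complement: 0 - x, modulo 2^32. */
      return UINT32_C(0) - bits;
   default:
      assert(!"unhandled register type");
      return bits;
   }
}
```

The integer 1 negated as a signed value gives 0xffffffff, but negated as if it were a float it gives 0x80000001, so applying the float rule to integer data silently produces the wrong result.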

[Mesa-dev] [PATCH 3/3] gallivm/tgsi: fix issues with sample opcodes

2013-02-14 Thread sroland
From: Roland Scheidegger srol...@vmware.com We need to encode them as Texture instructions since the NumOffsets field is encoded there. However, we don't encode the actual target in there; this is derived from the sampler view src later. --- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |

[Mesa-dev] [PATCH] gallium: fix tgsi SAMPLE_L opcode to use separate source for explicit lod

2013-02-11 Thread sroland
From: Roland Scheidegger srol...@vmware.com It looks like using coord.w as explicit lod value is a mistake, most likely because some dx10 docs had it specified that way. Seems this was changed though: http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx - let's just

[Mesa-dev] [PATCH 1/2] llvmpipe: first steps of adding dual source blend support

2013-02-07 Thread sroland
From: Roland Scheidegger srol...@vmware.com This adds support for the additional blending factors to the blend function itself, and also enables testing of it in lp_test_blend (which passes). Still need to add the glue code of linking fs shader outputs to blend inputs in llvmpipe, and probably
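
What the additional blend factors mean can be sketched with a scalar reference blend, assuming an additive blend equation and a small, hypothetical subset of factors. This is not the llvmpipe blend code, which is generated as vectorized LLVM IR and handles many more factors, equations and formats.

```c
#include <assert.h>

struct rgba { float r, g, b, a; };

/* Hypothetical subset of blend factors, for illustration only. */
enum blend_factor {
   FACTOR_ZERO,
   FACTOR_ONE,
   FACTOR_SRC_ALPHA,
   FACTOR_SRC1_COLOR,
   FACTOR_INV_SRC1_COLOR
};

/* Expand a factor into per-channel multipliers. src1 is the second
 * fragment shader color output used by dual source blending. */
static struct rgba
factor_value(enum blend_factor f, struct rgba src, struct rgba src1)
{
   struct rgba v = { 1.0f, 1.0f, 1.0f, 1.0f };

   switch (f) {
   case FACTOR_ZERO:
      v.r = v.g = v.b = v.a = 0.0f;
      break;
   case FACTOR_ONE:
      break;
   case FACTOR_SRC_ALPHA:
      v.r = v.g = v.b = v.a = src.a;
      break;
   case FACTOR_SRC1_COLOR:
      v = src1;
      break;
   case FACTOR_INV_SRC1_COLOR:
      v.r = 1.0f - src1.r;
      v.g = 1.0f - src1.g;
      v.b = 1.0f - src1.b;
      v.a = 1.0f - src1.a;
      break;
   default:
      assert(!"unknown blend factor");
      break;
   }
   return v;
}

/* result = src * src_factor + dst * dst_factor, per channel. */
static struct rgba
blend_add(struct rgba src, struct rgba src1, struct rgba dst,
          enum blend_factor src_factor, enum blend_factor dst_factor)
{
   struct rgba sf = factor_value(src_factor, src, src1);
   struct rgba df = factor_value(dst_factor, src, src1);
   struct rgba out;

   out.r = src.r * sf.r + dst.r * df.r;
   out.g = src.g * sf.g + dst.g * df.g;
   out.b = src.b * sf.b + dst.b * df.b;
   out.a = src.a * sf.a + dst.a * df.a;
   return out;
}
```

The point of the second source is that the destination can be scaled per channel by a value the shader computes independently of the color it adds.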

[Mesa-dev] [PATCH 2/2] llvmpipe: implement dual source blending

2013-02-07 Thread sroland
From: Roland Scheidegger srol...@vmware.com Link up the fs outputs and blend inputs, and make sure the second blend source is correctly loaded and converted (which is quite complex). There's a slight refactoring of the monster generate_unswizzled_blend() function where it makes sense to factor
