Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats

2010-09-06 Thread Roland Scheidegger
On 06.09.2010 15:57, José Fonseca wrote:
 I'd like to know if there's any objection to change the
 resource_copy_region semantics to allow copies between different yet
 compatible formats, where the definition of compatible formats is:
 
   formats for which copying the bytes from the source resource
 unmodified to the destination resource will achieve the same effect of a
 textured quad blitter
 
 There is a helper function, util_is_format_compatible(), to help make
 this decision. These are the non-trivial conversions the function
 currently recognizes (as produced by u_format_compatible_test.c):
 
   b8g8r8a8_unorm - b8g8r8x8_unorm
   a8r8g8b8_unorm - x8r8g8b8_unorm
   b5g5r5a1_unorm - b5g5r5x1_unorm
   b4g4r4a4_unorm - b4g4r4x4_unorm
   l8_unorm - r8_unorm
   i8_unorm - l8_unorm
   i8_unorm - a8_unorm
   i8_unorm - r8_unorm
   l16_unorm - r16_unorm
   z24_unorm_s8_uscaled - z24x8_unorm
   s8_uscaled_z24_unorm - x8z24_unorm
   r8g8b8a8_unorm - r8g8b8x8_unorm
   a8b8g8r8_srgb - x8b8g8r8_srgb
   b8g8r8a8_srgb - b8g8r8x8_srgb
   a8r8g8b8_srgb - x8r8g8b8_srgb
   a8b8g8r8_unorm - x8b8g8r8_unorm
   r10g10b10a2_uscaled - r10g10b10x2_uscaled
   r10sg10sb10sa2u_norm - r10g10b10x2_snorm
 
 Note that format compatibility is not commutative.
 
 For software drivers this means that memcpy/util_copy_rect() will
 achieve the correct result.
 
 For hardware drivers this means that a VRAM-VRAM 2D blit engine will
 also achieve the correct result.
 
 So I'd expect no implementation change of resource_copy_region() for any
 driver AFAICT. But I'd like to be sure.
 
 Jose
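A minimal sketch of the software path described above: copying a sub-rectangle row by row with memcpy, which is correct exactly because compatible formats share a bit-for-bit layout. The signature is simplified from the real util_copy_rect() helper (blocksize is bytes per pixel, strides are in bytes):

```c
#include <stddef.h>
#include <string.h>

/* Simplified util_copy_rect()-style helper: copy a width x height
 * sub-rectangle of pixels row by row.  Not the real Gallium signature. */
static void
copy_rect(unsigned char *dst, size_t dst_stride,
          const unsigned char *src, size_t src_stride,
          size_t blocksize, unsigned width, unsigned height)
{
   /* Compatible formats are byte-identical, so a per-row memcpy
    * already yields the same result a textured-quad blit would. */
   for (unsigned y = 0; y < height; y++)
      memcpy(dst + y * dst_stride, src + y * src_stride, width * blocksize);
}
```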

José,

this looks good to me. Note that the analogous function in d3d10,
ResourceCopyRegion, only requires formats to be in the same typeless
group (hence the same number of bits for all components), which is
certainly a broader set of compatible formats than what
util_is_format_compatible() outputs. As far as I can tell, no conversion
happens at all in d3d10; it is just like memcpy. I think we might want
to support that in the future as well, but for now extending this to the
formats you listed certainly sounds ok.

Roland

--
This SF.net Dev2Dev email is sponsored by:

Show off your parallel programming skills.
Enter the Intel(R) Threading Challenge 2010.
http://p.sf.net/sfu/intel-thread-sfd
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats

2010-09-06 Thread Roland Scheidegger
On 06.09.2010 17:16, Luca Barbieri wrote:
 On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote:
 I'd like to know if there's any objection to change the
 resource_copy_region semantics to allow copies between different yet
 compatible formats, where the definition of compatible formats is:
 
 I was about to propose something like this.
 
 How about a much more powerful change though, that would make any pair
 of non-blocked format of the same bit depth compatible?
 This way you could copy z24s8 to r8g8b8a8, for instance.
I am not sure this makes a lot of sense. There's no guarantee the bit
layout of these is even remotely similar (and it likely won't be on any
decent hardware). I think the dx10 restriction makes sense here.

 
 In addition to this, how about explicitly allowing sampler views to
 use a compatible format, and add the ability for surfaces to use a
 compatible format too? (with a new parameter to get_tex_surface)
Note that get_tex_surface is dead (in gallium-array-textures - not
merged yet, but it will happen eventually). Its replacement (for render
targets or depth stencil), create_surface(), can already be supplied
with a format parameter. The set of compatible formats, though, should
ultimately end up similar to dx10's.

 
 This would allow for instance to implement glBlitFramebuffer on
 stencil buffers by reinterpreting the buffer as r8g8b8a8, and allow
 the blitter module to copy depth/stencil buffers by simply treating
 them as color buffers.
 
 The only issue is that some drivers might hold depth/stencil surfaces
 in compressed formats that cannot be interpreted as a color format,
 and not have any mechanism for keeping temporaries or doing
 conversions internally.
I think that's a pretty big if. I could be wrong but I think operations
like blitting stencil buffers are pretty rare anyway (afaik other apis
don't allow things like that).

 
 DirectX seems to have something like this with the _TYPELESS formats.
Yes, and it precisely won't allow you to interpret s24_z8 as r8g8b8a8 or
other wonky stuff - only if all components have the same number of bits.
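The d3d10 rule above - reinterpretation allowed only when every component occupies the same number of bits - could be sketched like this. The format descriptor is invented for illustration; a real Gallium implementation would consult util_format_description() instead:

```c
#include <stdbool.h>

/* Hypothetical per-format descriptor: component count and per-component
 * bit widths.  A d3d10-style "same typeless group" check compares them
 * one by one, so a raw copy reinterprets bits but never converts. */
struct fmt_desc {
   unsigned nr_components;
   unsigned char bits[4];   /* bits per component */
};

static bool
same_typeless_group(const struct fmt_desc *a, const struct fmt_desc *b)
{
   if (a->nr_components != b->nr_components)
      return false;
   for (unsigned i = 0; i < a->nr_components; i++)
      if (a->bits[i] != b->bits[i])
         return false;
   return true;
}
```

Under this check, b8g8r8a8 and r8g8b8x8 (8/8/8/8) group together, while z24s8 (24/8) against r8g8b8a8 fails, matching the restriction discussed above.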

Roland




Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats

2010-09-06 Thread Roland Scheidegger
On 06.09.2010 22:03, Luca Barbieri wrote:
 This way you could copy z24s8 to r8g8b8a8, for instance.
 
 I am not sure this makes a lot of sense. There's no guarantee the bit
 layout of these is even remotely similar (and it likely won't be on any
 decent hardware). I think the dx10 restriction makes sense here.
 
 Yes, it depends on the flexibility of the hardware and the driver.
 Due to depth textures, I think it is actually likely that you can
 easily treat depth as color.
 
 The worst issue right now is that stencil cannot be accessed in a
 sensible way at all, which makes implementing glBlitFramebuffer of
 STENCIL_BIT with NEAREST and different rect sizes impossible.
 Some cards (r600+ at least) can write stencil in shaders, but on some
 you must reinterpret the surface.
 And resource_copy_region does not support stretching, so it can't be used.
 
 Since not all cards can write stencil in shaders, one either needs to
 be able to bind depth/stencil as a color buffer, or extend
 resource_copy_region to support stretching with nearest filtering, or
 both (possibly in addition to having the option of using stencil
 export in shaders).
Yes, accessing stencil is a problem - other apis just disallow that...
There are other problems with accessing stencil, for instance
WritePixels with a multisampled depth/stencil buffer (which you can't
really map, hence cpu fallbacks don't even work). Plus you really don't
want any cpu fallbacks anyway.
Using stencil export (ARB_shader_stencil_export) seems like a clean
solution, but as you said not all cards support it.
Plus you can't actually get the stencil values with texture sampling
either, so this doesn't help that much (well, you can't get them with
GL, though hardware may support it, I guess).
When I said it won't work with decent hardware, I really meant it won't
work due to compression. Now, it's quite possible compression can be
disabled on any chip, but you don't know that beforehand, hence you need
to jump through hoops later to get an uncompressed version of your
compressed buffer.
Do applications actually ever use blitframebuffer with the stencil bit
(with different sizes - otherwise resource_copy_region could be used)?
It just seems to me that casts to completely different formats (well,
still with the same total bitwidth, but still) are very unclean, but I
don't have any good solution for this - if no one ever uses it in
practice a cpu fallback is just fine, but as said that won't work for
multisampled buffers, for instance, either.
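For reference, the operation being discussed - a nearest-filtered stretch blit over an 8-bit stencil plane, as glBlitFramebuffer with the stencil bit and different rect sizes would need - could be sketched as below. This assumes the stencil values are mappable as a plain byte plane, which is exactly what a compressed or multisampled buffer may not allow:

```c
#include <stddef.h>

/* Nearest-filtered stretch copy of an 8-bit stencil plane.
 * Strides are in bytes; source sample position is picked by simple
 * integer scaling (a crude nearest filter, for illustration only). */
static void
stretch_stencil_nearest(unsigned char *dst, size_t dst_stride,
                        unsigned dst_w, unsigned dst_h,
                        const unsigned char *src, size_t src_stride,
                        unsigned src_w, unsigned src_h)
{
   for (unsigned y = 0; y < dst_h; y++) {
      unsigned sy = y * src_h / dst_h;       /* nearest source row */
      for (unsigned x = 0; x < dst_w; x++) {
         unsigned sx = x * src_w / dst_w;    /* nearest source column */
         dst[y * dst_stride + x] = src[sy * src_stride + sx];
      }
   }
}
```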

 
 Other things would likely benefit, such as GL_NV_copy_depth_to_color.

Roland



Re: [Mesa3d-dev] ARB draw buffers + texenv program

2010-04-13 Thread Roland Scheidegger
On 13.04.2010 02:52, Dave Airlie wrote:
 On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote:
 Dave Airlie wrote:
 Just going down the r300g piglit failures and noticed fbo-drawbuffers
 failed, I've no idea
 if this passes on Intel hw, but it appears the texenvprogram really
 needs to understand the
 draw buffers. The attached patch fixes it here for me on r300g anyone
 want to test this on Intel
 with the piglit test before/after?
 The piglit test passes as-is with Mesa/swrast and NVIDIA.

 It fails with gallium/softpipe both with and w/out your patch.

 I think that your patch is on the right track.  But multiple render targets
 are still a bit of an untested area in the st/mesa code.

 One thing: the patch introduces a dependency on buffer state in the
 texenvprogram code so in state.c we should check for the _NEW_BUFFERS flag.

 Otherwise, I'd like to debug the softpipe failure a bit further to see
 what's going on.  Perhaps you could hold off on committing this for a bit...
 
 Well Eric pointed out to me the fun line in the spec
 
 (3) Should gl_FragColor be aliased to gl_FragData[0]?
 
   RESOLUTION: No.  A shader should write either gl_FragColor, or
   gl_FragData[n], but not both.
 
   Writing to gl_FragColor will write to all draw buffers specified
   with DrawBuffersARB.
 
 So I was really just masking the issue with this. From what I can see
 softpipe messes up and I'm not sure where we should be fixing this.
 swrast does okay, its just whether we should be doing something in gallium
 or in the drivers is open.

Hmm yes looks like that's not really well defined. I guess there are
several options here:
1) don't do anything at the state tracker level, and assume that if a
fragment shader only writes to color 0 but has several color buffers
bound the color is meant to go to all outputs. Looks like that's what
nv50 is doing today. If a shader writes to FragData[0] but not others,
in gallium that would mean that output still gets replicated to all
outputs, but since the spec says unwritten outputs are undefined that
would be just fine (for OpenGL - not sure about other APIs).
2) Use some explicit means to distinguish FragData[] from FragColor in
gallium. For instance, could use different semantic name (like
TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective
outputs). Or could have a flag somewhere (not quite sure where) saying
if color output is to be replicated to all buffers.
3) Translate away the single color output in state tracker to multiple
outputs.

I don't like option 3) though. It means we need to recompile if the
attached buffers change. Moreover, it seems both new nvidia and AMD
chips (r600 has a MULTIWRITE_ENABLE bit) handle this just fine in hw.
I don't like option 1) either; that kind of implicit behavior might be
ok, but this kind of guesswork isn't very nice imho.
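What option 1) implies for a (software) driver can be modeled roughly as below: when the fragment shader declares a single color output, its value is replicated to every bound color buffer; otherwise outputs map 1:1 and unmatched buffers stay untouched (undefined per the spec). Everything here is a simplified model, not real driver code:

```c
#define MAX_CBUFS 8

/* Simplified model of a fragment shader's per-pixel result. */
struct shader_result {
   unsigned nr_outputs;            /* declared color outputs */
   float color[MAX_CBUFS][4];      /* values written by the shader */
};

/* Output-merger sketch: replicate a single declared color output to all
 * bound color buffers, else map outputs to buffers 1:1. */
static void
write_color_outputs(float cbufs[][4], unsigned nr_cbufs,
                    const struct shader_result *fs)
{
   for (unsigned i = 0; i < nr_cbufs; i++) {
      unsigned src = (fs->nr_outputs == 1) ? 0 : i;
      if (src < fs->nr_outputs)
         for (unsigned c = 0; c < 4; c++)
            cbufs[i][c] = fs->color[src][c];
      /* buffers with no matching output are left as-is (undefined) */
   }
}
```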

Opinions?

Roland

--
Download Intel(R) Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] ARB draw buffers + texenv program

2010-04-13 Thread Roland Scheidegger
On 13.04.2010 20:28, Alex Deucher wrote:
 On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson
 mostawesomed...@gmail.com wrote:
 On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger srol...@vmware.com 
 wrote:
 On 13.04.2010 02:52, Dave Airlie wrote:
 On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote:
 Dave Airlie wrote:
 Just going down the r300g piglit failures and noticed fbo-drawbuffers
 failed, I've no idea
 if this passes on Intel hw, but it appears the texenvprogram really
 needs to understand the
 draw buffers. The attached patch fixes it here for me on r300g anyone
 want to test this on Intel
 with the piglit test before/after?
 The piglit test passes as-is with Mesa/swrast and NVIDIA.

 It fails with gallium/softpipe both with and w/out your patch.

 I think that your patch is on the right track.  But multiple render 
 targets
 are still a bit of an untested area in the st/mesa code.

 One thing: the patch introduces a dependency on buffer state in the
 texenvprogram code so in state.c we should check for the _NEW_BUFFERS 
 flag.

 Otherwise, I'd like to debug the softpipe failure a bit further to see
 what's going on.  Perhaps you could hold off on committing this for a 
 bit...
 Well Eric pointed out to me the fun line in the spec

 (3) Should gl_FragColor be aliased to gl_FragData[0]?

   RESOLUTION: No.  A shader should write either gl_FragColor, or
   gl_FragData[n], but not both.

   Writing to gl_FragColor will write to all draw buffers specified
   with DrawBuffersARB.

 So I was really just masking the issue with this. From what I can see
 softpipe messes up and I'm not sure where we should be fixing this.
 swrast does okay, its just whether we should be doing something in gallium
 or in the drivers is open.
 Hmm yes looks like that's not really well defined. I guess there are
 several options here:
 1) don't do anything at the state tracker level, and assume that if a
 fragment shader only writes to color 0 but has several color buffers
 bound the color is meant to go to all outputs. Looks like that's what
 nv50 is doing today. If a shader writes to FragData[0] but not others,
 in gallium that would mean that output still gets replicated to all
 outputs, but since the spec says unwritten outputs are undefined that
 would be just fine (for OpenGL - not sure about other APIs).
 2) Use some explicit means to distinguish FragData[] from FragColor in
 gallium. For instance, could use different semantic name (like
 TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective
 outputs). Or could have a flag somewhere (not quite sure where) saying
 if color output is to be replicated to all buffers.
 3) Translate away the single color output in state tracker to multiple
 outputs.

 I don't like option 3) though. Means we need to recompile if the
 attached buffers change. Moreover, it seems both new nvidia and AMD
 chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw.
 I don't like option 1) neither, that kind of implicit behavior might be
 ok but this kind of guesswork isn't very nice imho.
 Whatever's easiest, just document it. I'd be cool with:

 DECL IN[0], COLOR, PERSPECTIVE
 DECL OUT[0], COLOR
 MOV OUT[0], IN[0]
 END

 Effectively being a write to all color buffers, however, this one from
 progs/tests/drawbuffers:

 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR
 DCL OUT[1], COLOR[1]
 IMM FLT32 { 1., 0., 0., 0. }
  0: MOV OUT[0], IN[0]
 1: SUB OUT[1], IMM[0], IN[0]
  2: END

 Would then double-write the second color buffer. Unpleasant. Language
 like this would work, I suppose?

 
 If only one color output is declared, writes to the color output shall
 be redirected to all bound color buffers. Otherwise, color outputs
 shall be bound to their specific color buffer.
 
 
 Also, keep in mind that writing to multiple color buffers uses
 additional memory bandwidth, so for performance, we should only do so
 when required.

Do apps really have several color buffers bound but only write to one,
leaving the state of the others undefined in the process? Sounds like a
poor app to begin with to me.
Actually, I would restrict the language above further, so that only
color output 0 gets redirected to all buffers if it's the only one
written. As said, though, I'd think some explicit bits somewhere would
be cleaner. I'm not yet sure the above would really work for all APIs;
it is possible some specify that buffers not written to are left as-is
instead of undefined.

Roland


Re: [Mesa3d-dev] ARB draw buffers + texenv program

2010-04-13 Thread Roland Scheidegger
On 14.04.2010 00:38, Dave Airlie wrote:
 On Wed, Apr 14, 2010 at 8:33 AM, Roland Scheidegger srol...@vmware.com 
 wrote:
 On 13.04.2010 20:28, Alex Deucher wrote:
 On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson
 mostawesomed...@gmail.com wrote:
 On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger srol...@vmware.com 
 wrote:
 On 13.04.2010 02:52, Dave Airlie wrote:
 On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote:
 Dave Airlie wrote:
 Just going down the r300g piglit failures and noticed fbo-drawbuffers
 failed, I've no idea
 if this passes on Intel hw, but it appears the texenvprogram really
 needs to understand the
 draw buffers. The attached patch fixes it here for me on r300g anyone
 want to test this on Intel
 with the piglit test before/after?
 The piglit test passes as-is with Mesa/swrast and NVIDIA.

 It fails with gallium/softpipe both with and w/out your patch.

 I think that your patch is on the right track.  But multiple render 
 targets
 are still a bit of an untested area in the st/mesa code.

 One thing: the patch introduces a dependency on buffer state in the
 texenvprogram code so in state.c we should check for the _NEW_BUFFERS 
 flag.

 Otherwise, I'd like to debug the softpipe failure a bit further to see
 what's going on.  Perhaps you could hold off on committing this for a 
 bit...
 Well Eric pointed out to me the fun line in the spec

 (3) Should gl_FragColor be aliased to gl_FragData[0]?

   RESOLUTION: No.  A shader should write either gl_FragColor, or
   gl_FragData[n], but not both.

   Writing to gl_FragColor will write to all draw buffers specified
   with DrawBuffersARB.

 So I was really just masking the issue with this. From what I can see
 softpipe messes up and I'm not sure where we should be fixing this.
 swrast does okay, its just whether we should be doing something in 
 gallium
 or in the drivers is open.
 Hmm yes looks like that's not really well defined. I guess there are
 several options here:
 1) don't do anything at the state tracker level, and assume that if a
 fragment shader only writes to color 0 but has several color buffers
 bound the color is meant to go to all outputs. Looks like that's what
 nv50 is doing today. If a shader writes to FragData[0] but not others,
 in gallium that would mean that output still gets replicated to all
 outputs, but since the spec says unwritten outputs are undefined that
 would be just fine (for OpenGL - not sure about other APIs).
 2) Use some explicit means to distinguish FragData[] from FragColor in
 gallium. For instance, could use different semantic name (like
 TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective
 outputs). Or could have a flag somewhere (not quite sure where) saying
 if color output is to be replicated to all buffers.
 3) Translate away the single color output in state tracker to multiple
 outputs.

 I don't like option 3) though. Means we need to recompile if the
 attached buffers change. Moreover, it seems both new nvidia and AMD
 chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw.
 I don't like option 1) neither, that kind of implicit behavior might be
 ok but this kind of guesswork isn't very nice imho.
 Whatever's easiest, just document it. I'd be cool with:

 DECL IN[0], COLOR, PERSPECTIVE
 DECL OUT[0], COLOR
 MOV OUT[0], IN[0]
 END

 Effectively being a write to all color buffers, however, this one from
 progs/tests/drawbuffers:

 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR
 DCL OUT[1], COLOR[1]
 IMM FLT32 { 1., 0., 0., 0. }
  0: MOV OUT[0], IN[0]
  1: SUB OUT[1], IMM[0]., IN[0]
  2: END

 Would then double-write the second color buffer. Unpleasant. Language
 like this would work, I suppose?

 
 If only one color output is declared, writes to the color output shall
 be redirected to all bound color buffers. Otherwise, color outputs
 shall be bound to their specific color buffer.
 
 Also, keep in mind that writing to multiple color buffers uses
 additional memory bandwidth, so for performance, we should only do so
 when required.
 Do apps really have several color buffers bound but only write to one,
 leaving the state of the others undefined in the process? Sounds like a
 poor app to begin with to me.
 Actually, I would restrict that language above further, so only color
 output 0 will get redirected to all buffers if it's the only one
 written. As said though I'd think some explicit bits somewhere are
 cleaner. I'm not yet sure that the above would really work for all APIs,
 it is possible some say other buffers not written to are left as is
 instead of undefined.
 
 Who knows, the GL API allows for it, I don't see how we can
 arbitrarily decide to restrict it.
 
 I could write an app that uses multiple fragment programs, and
 switches between them, with two outputs buffers bound, though I'm
 possibly constructing something very arbitrary.
I fail to see the problem. If you have two color buffers bound

Re: [Mesa3d-dev] gallium-resources branch merge

2010-04-10 Thread Roland Scheidegger
On 10.04.2010 14:00, Keith Whitwell wrote:
 Hmm, not sure whether to merge or squash-merge this branch.  Any thoughts?

I'm no big fan of squash merges, but the history of a normal merge won't
be nice either. Tough call, though I'd prefer a normal merge.

Roland



Re: [Mesa3d-dev] gallium-resources branch merge

2010-04-10 Thread Roland Scheidegger
On 10.04.2010 16:43, Chia-I Wu wrote:
 On Sat, Apr 10, 2010 at 8:00 PM, Keith Whitwell
 keith.whitw...@googlemail.com wrote:
 Hmm, not sure whether to merge or squash-merge this branch.  Any thoughts?
 The conversion to pipe_resource seems to be done by components.  Maybe a new
 branch that reorganize (git rebase -i) the commits in gallium-resources and
 merge the new branch to master?

I've never used git rebase -i, but I'm not convinced it could give
something sensible here. The conversion wasn't done strictly by
components - there were a couple of merges from master (and
gallium-buffer-usage-cleanup) in between, plus fixes for
already-converted things...

Roland





Re: [Mesa3d-dev] gallium-resources branch merge

2010-04-10 Thread Roland Scheidegger
On 10.04.2010 17:10, Keith Whitwell wrote:
 On Sat, Apr 10, 2010 at 4:05 PM, Keith Whitwell
 keith.whitw...@googlemail.com wrote:
 On Sat, Apr 10, 2010 at 3:49 PM, Roland Scheidegger srol...@vmware.com 
 wrote:
 On 10.04.2010 16:43, Chia-I Wu wrote:
 On Sat, Apr 10, 2010 at 8:00 PM, Keith Whitwell
 keith.whitw...@googlemail.com wrote:
 Hmm, not sure whether to merge or squash-merge this branch.  Any thoughts?
 The conversion to pipe_resource seems to be done by components.  Maybe a 
 new
 branch that reorganize (git rebase -i) the commits in gallium-resources and
 merge the new branch to master?
 I've never used git rebase -i but I'm not convinced that can give
 something sensible. It wasn't done strictly by components, with a couple
 merges from master (and gallium-buffer-usage-cleanup) in between and
 fixes for already converted things...

 Squash merge it is.
 
 Somewhat arbitrary decision to avoid stretching this out any further.
 
 I don't think the history that was on the branch was very useful, nor
 does inventing history seem likely to help people searching for
 regressions, etc.  The branch is effectively an atomic change, so
 let's deal with it like that...

Yeah, you're right. Thinking about it, parts of it were always broken
throughout the life of the branch, or didn't even build, so a squash
merge makes sense. Glad it's merged - no more conflict fixing for merges
from master :-).

Roland




Re: [Mesa3d-dev] Mesa (gallium-resources): gallium: fix comments for changed USAGE flags

2010-04-09 Thread Roland Scheidegger
On 09.04.2010 17:49, Keith Whitwell wrote:
 On Fri, 2010-04-09 at 08:45 -0700, Roland Scheidegger wrote:
 Module: Mesa
 Branch: gallium-resources
 Commit: faf53328d1154c51d8a59513f2bfcae62272b0bf
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=faf53328d1154c51d8a59513f2bfcae62272b0bf

 Author: Roland Scheidegger srol...@vmware.com
 Date:   Fri Apr  9 17:44:24 2010 +0200

 gallium: fix comments for changed USAGE flags

 ---

  src/gallium/auxiliary/util/u_simple_screen.h  |9 +
  src/gallium/drivers/svga/svga_winsys.h|   10 --
  src/gallium/include/pipe/p_screen.h   |2 +-
  src/gallium/include/state_tracker/sw_winsys.h |2 +-
  4 files changed, 11 insertions(+), 12 deletions(-)

 diff --git a/src/gallium/auxiliary/util/u_simple_screen.h 
 b/src/gallium/auxiliary/util/u_simple_screen.h
 index 0042277..1ba59af 100644
 --- a/src/gallium/auxiliary/util/u_simple_screen.h
 +++ b/src/gallium/auxiliary/util/u_simple_screen.h
 @@ -73,9 +73,10 @@ struct pipe_winsys
  * window systems must then implement that interface (rather than the
  * other way around...).
  *
 -* usage is a bitmask of PIPE_BUFFER_USAGE_PIXEL/VERTEX/INDEX/CONSTANT. 
 This
 -* usage argument is only an optimization hint, not a guarantee, 
 therefore
 -* proper behavior must be observed in all circumstances.
 +* usage is a bitmask of PIPE_BIND_*.
 +* XXX is this true?
 +* This usage argument is only an optimization hint, not a guarantee,
 +* therefore proper behavior must be observed in all circumstances.
 
 The new flags are no longer hints - they are supposed actually specify
 which operations are permitted on a resource.  
 
 Unfortunately I don't think this is very well enforced yet -- I intend
 to add a debug layer to sit between state-tracker and driver, based on
 the drivers/identity layer, which will check for violations of this 
 other rules.

Ok, I thought this was the case, but wasn't sure. I'll fix the comment.
In the svga code, I actually couldn't figure out the usage flags when a
winsys buffer is created. It looks like usage is always 0, except for
queries, which use SVGA_BUFFER_USAGE_PINNED. Of course, that's not a
resource but a winsys buffer, but as far as I can tell this ends up as a
pb_buffer usage flag. Not sure if that's ok or supposed to be like that...

Roland



Re: [Mesa3d-dev] gallium-resources branch merge

2010-04-09 Thread Roland Scheidegger
On 09.04.2010 17:29, STEVE555 wrote:
 Hi all,
 I've git branched and got the latest commits from the
 gallium-resources branch and also the latest commits from git master.
 
 I did a gmake -B realclean from a previous compile on my copy of git
 master, did a git checkout gallium-resources to switch to that branch,
 and ran ./autogen.sh with the following options:
 
  --prefix=/usr/local --enable-32-bit --enable-xcb --enable-gallium-nouveau
 --with-state-trackers=dri,egl,xorg,glx,vega,es --enable-motif
 --enable-gl-osmesa --disable-gallium-intel --disable-gallium-radeon
 --with-expat=/usr/lib --with-demos=xdemos,demos,trivial,tests
 --with-dri-drivers=swrast --enable-gallium-swrast --enable-gallium-svga
 --with-max-width=4096 --with-max-height=4096 --enable-debug
 
 I then did a gmake to compile my copy of gallium-resources, but it ended
 with an error:
This should be fixed now.

Roland



Re: [Mesa3d-dev] Mesa (gallium-resources): gallium: fix comments for changed USAGE flags

2010-04-09 Thread Roland Scheidegger
On 09.04.2010 18:22, José Fonseca wrote:
 On Fri, 2010-04-09 at 09:02 -0700, Keith Whitwell wrote:
 On Fri, 2010-04-09 at 08:59 -0700, Roland Scheidegger wrote:
 On 09.04.2010 17:49, Keith Whitwell wrote:
 On Fri, 2010-04-09 at 08:45 -0700, Roland Scheidegger wrote:
 Module: Mesa
 Branch: gallium-resources
 Commit: faf53328d1154c51d8a59513f2bfcae62272b0bf
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=faf53328d1154c51d8a59513f2bfcae62272b0bf

 Author: Roland Scheidegger srol...@vmware.com
 Date:   Fri Apr  9 17:44:24 2010 +0200

 gallium: fix comments for changed USAGE flags

 ---

  src/gallium/auxiliary/util/u_simple_screen.h  |9 +
  src/gallium/drivers/svga/svga_winsys.h|   10 --
  src/gallium/include/pipe/p_screen.h   |2 +-
  src/gallium/include/state_tracker/sw_winsys.h |2 +-
  4 files changed, 11 insertions(+), 12 deletions(-)

 diff --git a/src/gallium/auxiliary/util/u_simple_screen.h 
 b/src/gallium/auxiliary/util/u_simple_screen.h
 index 0042277..1ba59af 100644
 --- a/src/gallium/auxiliary/util/u_simple_screen.h
 +++ b/src/gallium/auxiliary/util/u_simple_screen.h
 @@ -73,9 +73,10 @@ struct pipe_winsys
  * window systems must then implement that interface (rather than the
  * other way around...).
  *
 -* usage is a bitmask of 
 PIPE_BUFFER_USAGE_PIXEL/VERTEX/INDEX/CONSTANT. This
 -* usage argument is only an optimization hint, not a guarantee, 
 therefore
 -* proper behavior must be observed in all circumstances.
 +* usage is a bitmask of PIPE_BIND_*.
 +* XXX is this true?
 +* This usage argument is only an optimization hint, not a guarantee,
 +* therefore proper behavior must be observed in all circumstances.
 The new flags are no longer hints - they are supposed actually specify
 which operations are permitted on a resource.  

 Unfortunately I don't think this is very well enforced yet -- I intend
 to add a debug layer to sit between state-tracker and driver, based on
 the drivers/identity layer, which will check for violations of this 
 other rules.
 Ok, I thought this to be the case, but wasn't sure. I'll fix the comment.
 In the svga code, I actually couldn't figure out the usage flags when a
 winsys buffer is created. It looks like usage is always 0, except for
 queries it uses SVGA_BUFFER_USAGE_PINNED. Of course, that's not a
 resource but a winsys buffer, but as far as I can tell this ends up in a
 pb_buffer usage flag. Not sure if that's ok or supposed to be like that...
 Jose has looked at this more recently than I have...
 
 pb_buffer sits between pipe driver and the winsys, and needs to pass
 custom buffer flags unmodified from svga to the winsys.
 
 SVGA_BUFFER_USAGE_PINNED is one of those usages.
So the svga winsys buffer_create function takes only custom flags, none
of the PB_USAGE ones? That is the idea I got from the code (plus the
custom flags would clearly overlap with the generic ones), and hence
that's what I updated the comment to (which clearly was wrong).
I'm not sure, though, that this really works with the pb code; I thought
it might do some checks on the usage flags there, but if you say it
works then I'd better believe it...

Roland




--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


[Mesa3d-dev] gallium-resources branch merge

2010-04-02 Thread Roland Scheidegger
I'm planning on merging the gallium-resources branch shortly (after
easter). Due to the amount of code changed, it wouldn't be unexpected if
some drivers break here and there. So it would be nice if the respective
driver authors could take a look at that branch now.

If you've missed the discussion about this branch and what this is
about, here it is:

http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12726.html

I've also removed the video interfaces completely, as they weren't
ported to the interface changes; actually some of the video code had
missed some earlier interface changes too, so it didn't build anyway.
Video-related work should be done on the pipe-video branch, which
already had newer stuff (for video).


Roland



Re: [Mesa3d-dev] How do we init half float tables?

2010-04-02 Thread Roland Scheidegger
On 02.04.2010 17:09, Luca Barbieri wrote:
 Additionally, the S3TC library may now support only a subset of the 
 formats. This may be even more useful as further compressed formats 
 are added.

FWIW, I don't see any new s3tc formats. rgtc will not be handled by s3tc
library since it isn't patent encumbered. util_format_is_s3tc will not
include rgtc formats.
(Though I guess that external decoding per-pixel is really rather lame,
should do it per-block...)

Roland



Re: [Mesa3d-dev] [PATCH] glsl: optimize sqrt

2010-03-29 Thread Roland Scheidegger
On 29.03.2010 04:50, Marek Olšák wrote:
  We were talking a bit on IRC that the GLSL compiler implements the sqrt
 function somewhat inefficiently. Instead of rsq+rcp+cmp instructions as
 is in the original code, the proposed patch uses just rsq+mul. Please
 see the patch log for further explanation, and please review.

I'll definitely agree with the mul instead of rcp part, as that should
be more efficient on a lot of modern hardware (rcp usually being part of
some special function block instead of main alu).
As far as I can tell though we still need the cmp, unfortunately, since
invsqrt(0) is infinite and multiplying that by 0 will give some
undefined result; for IEEE it should be NaN (well, depending on hardware
I guess: if you have an implementation which clamps infinity to its max
representable number it should be ok). In any case, glsl says invsqrt(0)
is undefined, hence we can't rely on this.
Thinking about it, we'd possibly want a SQRT opcode, both in mesa and
tgsi. Because there's actually hardware which can do sqrt (i965
MathBox), and just as importantly because this gives drivers a way to
implement this as invsqrt + mul without the cmp, if they can. For
instance AMD hardware generally has 3 rounding modes for these ops,
IEEE (which gives infinity for invsqrt(0)), DX (clamps to
MAX_FLOAT), and FF (which clamps infinity to 0, exactly what you need
to implement sqrt with a mul and invsqrt and no cmp - though actually it
should work with DX clamping as well).

Roland




 -Marek
 
 
 
 
 From 9b834a79a1819f3b4b9868be3e2696667791c83e Mon Sep 17 00:00:00 2001
 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= mar...@gmail.com
 Date: Sat, 27 Mar 2010 13:49:09 +0100
 Subject: [PATCH] glsl: optimize sqrt
 
 The new version can be derived from sqrt as follows:
 
 sqrt(x) =
 sqrt(x)^2 / sqrt(x) =
 x / sqrt(x) =
 x * rsqrt(x)
 
 Also the need for the CMP instruction is gone because there is no division
 by zero.
 ---
  .../shader/slang/library/slang_common_builtin.gc   |   22 +++
  1 files changed, 4 insertions(+), 18 deletions(-)
 
 diff --git a/src/mesa/shader/slang/library/slang_common_builtin.gc b/src/mesa/shader/slang/library/slang_common_builtin.gc
 index a25ca55..3f6596c 100644
 --- a/src/mesa/shader/slang/library/slang_common_builtin.gc
 +++ b/src/mesa/shader/slang/library/slang_common_builtin.gc
 @@ -602,50 +602,36 @@ vec4 exp2(const vec4 a)
  
  float sqrt(const float x)
  {
 -   const float nx = -x;
 float r;
 __asm float_rsq r, x;
 -   __asm float_rcp r, r;
 -   __asm vec4_cmp __retVal, nx, r, 0.0;
 +   __retVal = r * x;
  }
  
  vec2 sqrt(const vec2 x)
  {
 -   const vec2 nx = -x, zero = vec2(0.0);
 vec2 r;
 __asm float_rsq r.x, x.x;
 __asm float_rsq r.y, x.y;
 -   __asm float_rcp r.x, r.x;
 -   __asm float_rcp r.y, r.y;
 -   __asm vec4_cmp __retVal, nx, r, zero;
 +   __retVal = r * x;
  }
  
  vec3 sqrt(const vec3 x)
  {
 -   const vec3 nx = -x, zero = vec3(0.0);
 vec3 r;
 __asm float_rsq r.x, x.x;
 __asm float_rsq r.y, x.y;
 __asm float_rsq r.z, x.z;
 -   __asm float_rcp r.x, r.x;
 -   __asm float_rcp r.y, r.y;
 -   __asm float_rcp r.z, r.z;
 -   __asm vec4_cmp __retVal, nx, r, zero;
 +   __retVal = r * x;
  }
  
  vec4 sqrt(const vec4 x)
  {
 -   const vec4 nx = -x, zero = vec4(0.0);
 vec4 r;
 __asm float_rsq r.x, x.x;
 __asm float_rsq r.y, x.y;
 __asm float_rsq r.z, x.z;
 __asm float_rsq r.w, x.w;
 -   __asm float_rcp r.x, r.x;
 -   __asm float_rcp r.y, r.y;
 -   __asm float_rcp r.z, r.z;
 -   __asm float_rcp r.w, r.w;
 -   __asm vec4_cmp __retVal, nx, r, zero;
 +   __retVal = r * x;
  }
  
  
 
 
 
 




Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: List Quake3 extensions first.

2010-03-16 Thread Roland Scheidegger
On 16.03.2010 18:52, Keith Whitwell wrote:
 On Tue, 2010-03-16 at 08:32 -0700, Ian Romanick wrote:
 
 I'm also a bit surprised that not detecting GL_EXT_compiled_vertex_array
 has any impact on our Quake3 performance.  After all, our CVA
 implementation doesn't do anything!  Looking at the list, it seems more
 likely that GL_EXT_texture_env_add is the problem.  Not having that will
 cause Quake3 to use additional rendering passes in quite a few cases.
 
 I think if CVA isn't present, it falls back to glVertex() and friends...
Bad...

I'm not sure though that listing that extension first really solves all
problems. There's a quite famous bug: when you bring up the information
screen with the extension string, it'll actually segfault. I think
though that got fixed in later versions (though I don't know how; if by
just copying only the first n bytes of the extension string it obviously
wouldn't solve the problem that it doesn't recognize the CVA
extension...). And against this you can't really do anything other than
app detection and cutting the string appropriately...

Roland



[Mesa3d-dev] extensions supported or not in gallium

2010-03-11 Thread Roland Scheidegger
Hi,

there are currently a couple of extensions enabled in the mesa state
tracker which probably shouldn't be. These were moved there by commit
a0ae2ca033ec2024da1e01d1c11c0437837c031b (that is with dri they were
already always enabled before).

Does someone know off-hand which ones we can enable or not? I'm going to
kill off EXT_cull_vertex and TDFX_texture_compression_FXT1; clearly we
can't handle them.
The others in question are
ARB_window_pos
APPLE_client_storage
MESA_pack_invert
NV_vertex_program
NV_vertex_program1_1

(the latter two IIRC the problem was that regs needed to be
zero-initialized)

Currently gallium dri drivers also have ARB_imaging enabled (via
driInitExtensions()); I think that's not correct either.

Roland



Re: [Mesa3d-dev] extensions supported or not in gallium

2010-03-11 Thread Roland Scheidegger
On 11.03.2010 17:54, Brian Paul wrote:
 Roland Scheidegger wrote:
 Hi,

 there are currently a couple of extensions enabled in the mesa state
 tracker which probably shouldn't be. These were moved there by commit
 a0ae2ca033ec2024da1e01d1c11c0437837c031b (that is with dri they were
 already always enabled before).

 Someone knows off-hand which one we can enable or not? I'm going to kill
 off EXT_cull_vertex and  TDFX_texture_compression_FXT1, clearly we can't
 handle them.
 The others in question are
 ARB_window_pos
 
 handled in core mesa.
 
 
 APPLE_client_storage
 
 should not be enabled by default.
 
 
 MESA_pack_invert
 
 handled by core mesa.
 
 
 NV_vertex_program
 NV_vertex_program1_1

 (the latter two IIRC the problem was that regs needed to be
 zero-initialized)
 
 There may be other issues too.  Someone would have to enable the 
 extension(s) and do some testing.
 
 
 Currently gallium dri drivers also have ARB_imaging enabled too (via
 driInitExtensions()), I think that's not correct neither.
 
 Yeah, I think that needs to be disabled.

Ok thanks I've pushed a fix.

Roland



Re: [Mesa3d-dev] Mesa (master): util: Code generate functions to pack and unpack a single pixel.

2010-03-08 Thread Roland Scheidegger
On 07.03.2010 01:21, José Fonseca wrote:
 On Sat, 2010-03-06 at 05:44 -0800, Brian Paul wrote:
 On Sat, Mar 6, 2010 at 5:44 AM, José Fonseca jfons...@vmware.com wrote:
 On Mon, 2010-03-01 at 09:03 -0800, Michel Dänzer wrote:
 On Fri, 2010-02-26 at 08:47 -0800, Jose Fonseca wrote:
 Module: Mesa
 Branch: master
 Commit: 9beb302212a2afac408016cbd7b93c8b859e4910
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=9beb302212a2afac408016cbd7b93c8b859e4910

 Author: José Fonseca jfons...@vmware.com
 Date:   Fri Feb 26 16:45:22 2010 +

 util: Code generate functions to pack and unpack a single pixel.

 Should work correctly for all pixel formats except SRGB formats.

 Generated code made much simpler by defining the pixel format as
 a C structure. For example this is the generated structure for
 PIPE_FORMAT_B6UG5SR5S_NORM:

 union util_format_b6ug5sr5s_norm {
uint16_t value;
struct {
   int r:5;
   int g:5;
   unsigned b:6;
} chan;
 };
 José, are you aware that the memory layout of bitfields is mostly
 implementation dependent? IME this makes them mostly unusable for
 modelling hardware in a portable manner.
 It's not only implementation dependent and slow -- it is also buggy!

 gcc-4.4.3 is doing something very fishy to single bit fields.

 See the attached code. ff ff ff ff is expected, but ff ff ff 01 is
 printed with gcc-4.4.3. Even without any optimization. gcc-4.3.4 works
 fine.

 Am I missing something or is this effectively a bug?
 Same result with gcc 4.4.1.

 If pixel.chan.a is put into a temporary int var followed by the
 scaling arithmetic it comes out as expected.  Looks like a bug to me.
 
 Thanks. I'll submit a bug report then.
 
 BTW, it looks like sizeof(union util_format_b5g5r5a1_unorm) == 4, not 2.
 
 Yet another reason to stay away from bit fields..

Hmm, might be because the bitfields are of type unsigned, not uint16_t?

I've no idea either why it would return 01 and not ff, though.

Roland



Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium

2010-03-08 Thread Roland Scheidegger
On 07.03.2010 20:26, Marek Olšák wrote:
  This branch is aimed to address the following issues:
 * Extensions are advertised in both st/mesa and st/dri, doing the same
 thing in two places.
 * The inability to disable extensions in pipe_screen::get_param because
 st/dri overrides the decisions of st/mesa.
 
 Here's the branch:
 http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions
 
 The first commit moves the differences between st/dri and st/mesa to the
 latter and removes dri_init_extensions from st/dri. It doesn't remove
 any extensions from the list except for those not advertised by pipe_screen.
 
 The second commit enables texture_rectangle by default in Gallium. To my
 knowledge any Gallium hardware can do this and I suspect it was
 dependent on NPOT textures by accident.
 
 All this is of course tested with piglit and glean.
 
 Please review. In case it's not OK, please let me know what needs to be
 done.

The second commit looks fine to me.
The first one, I'm not sure. Maybe that's ok, but if so I'm wondering
why, since this skips all the mapping business driInitExtensions did and
just sets the extension enable bits to true. At least I'm fairly sure it
was needed in the past...

Roland



Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium

2010-03-08 Thread Roland Scheidegger
On 08.03.2010 14:22, Joakim Sindholt wrote:
 On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote:
 On 07.03.2010 20:26, Marek Olšák wrote:
  This branch is aimed to address the following issues:
 * Extensions are advertised in both st/mesa and st/dri, doing the same
 thing in two places.
 * The inability to disable extensions in pipe_screen::get_param because
 st/dri overrides the decisions of st/mesa.

 Here's the branch:
 http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions

 The first commit moves the differences between st/dri and st/mesa to the
 latter and removes dri_init_extensions from st/dri. It doesn't remove
 any extensions from the list except for those not advertised by pipe_screen.

 The second commit enables texture_rectangle by default in Gallium. To my
 knowledge any Gallium hardware can do this and I suspect it was
 dependent on NPOT textures by accident.

 All this is of course tested with piglit and glean.

 Please review. In case it's not OK, please let me know what needs to be
 done.
 The second commit looks fine to me.
 The first one, I'm not sure. Maybe that's ok, but if so I'm wondering
 why, since this skips all the mapping business driInitExtensions did and
 just sets the extension enable bits to true. At least I'm fairly sure it
 was needed in the past...

 Roland
 
 I believe airlied pointed out earlier that
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72
  fixed that problem.

But even with that commit, all drivers still call driInitExtensions at
least once, though the parameter list can be NULL. I don't see that
happening here.

Roland




Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium

2010-03-08 Thread Roland Scheidegger
Otherwise, looks good to me, but I'd prefer if someone more familiar
with the extension handling code could give it a look.

Roland

On 08.03.2010 17:03, Marek Olšák wrote:
 Alright, I will add driInitExtensions(ctx, NULL, TRUE) at the end of
 st_init_extensions. Anything else I missed or is it OK?
 
 -Marek
 
 On Mon, Mar 8, 2010 at 4:25 PM, Roland Scheidegger srol...@vmware.com
 mailto:srol...@vmware.com wrote:
 
 On 08.03.2010 14:22, Joakim Sindholt wrote:
  On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote:
  On 07.03.2010 20:26, Marek Olšák wrote:
   This branch is aimed to address the following issues:
  * Extensions are advertised in both st/mesa and st/dri, doing
 the same
  thing in two places.
  * The inability to disable extensions in pipe_screen::get_param
 because
  st/dri overrides the decisions of st/mesa.
 
  Here's the branch:
  http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions
 http://cgit.freedesktop.org/%7Emareko/mesa/log/?h=dri-extensions
 
  The first commit moves the differences between st/dri and
 st/mesa to the
  latter and removes dri_init_extensions from st/dri. It doesn't
 remove
  any extensions from the list except for those not advertised by
 pipe_screen.
 
  The second commit enables texture_rectangle by default in
 Gallium. To my
  knowledge any Gallium hardware can do this and I suspect it was
  dependent on NPOT textures by accident.
 
  All this is of course tested with piglit and glean.
 
  Please review. In case it's not OK, please let me know what
 needs to be
  done.
  The second commit looks fine to me.
  The first one, I'm not sure. Maybe that's ok, but if so I'm wondering
  why, since this skips all the mapping business driInitExtensions
 did and
  just sets the extension enable bits to true. At least I'm fairly
 sure it
  was needed in the past...
 
  Roland
 
  I believe airlied pointed out earlier that
 
 
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72
 fixed that problem.
 
 But even with that commit, all drivers still call driInitExtensions at
 least once, though the parameter list can be NULL. I don't see that
 happening here.
 
 Roland
 
 




Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium

2010-03-08 Thread Roland Scheidegger
Well I guess another solution would be to just call it directly from the
place the dri_extension code initially was, i.e. in dri_create_context.

Roland

On 08.03.2010 17:21, Jakob Bornecrantz wrote:
 Calling dri code from src/mesa/state_tracker is not allowed since it's
 supposed to be independent of windowing systems. That said, from what I
 can see both driInitExtensions and driInitSingleExtension could be
 folded into mesa core; I can't see anything dri-special about them.
 
 Cheers Jakob.
 
 On 8 mar 2010, at 16.12, Roland Scheidegger wrote:
 Otherwise, looks good to me, but I'd prefer if someone more familiar
 with the extension handling code could give it a look.

 Roland

 On 08.03.2010 17:03, Marek Olšák wrote:
 Alright, I will add driInitExtensions(ctx, NULL, TRUE) at the end of
 st_init_extensions. Anything else I missed or is it OK?

 -Marek

 On Mon, Mar 8, 2010 at 4:25 PM, Roland Scheidegger  
 srol...@vmware.com
 mailto:srol...@vmware.com wrote:

On 08.03.2010 14:22, Joakim Sindholt wrote:
 On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote:
 On 07.03.2010 20:26, Marek Olšák wrote:
 This branch is aimed to address the following issues:
 * Extensions are advertised in both st/mesa and st/dri, doing
the same
 thing in two places.
 * The inability to disable extensions in pipe_screen::get_param
because
 st/dri overrides the decisions of st/mesa.

 Here's the branch:
 http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions
http://cgit.freedesktop.org/%7Emareko/mesa/log/?h=dri-extensions
 The first commit moves the differences between st/dri and
st/mesa to the
 latter and removes dri_init_extensions from st/dri. It doesn't
remove
 any extensions from the list except for those not advertised by
pipe_screen.
 The second commit enables texture_rectangle by default in
Gallium. To my
 knowledge any Gallium hardware can do this and I suspect it was
 dependent on NPOT textures by accident.

 All this is of course tested with piglit and glean.

 Please review. In case it's not OK, please let me know what
needs to be
 done.
 The second commit looks fine to me.
 The first one, I'm not sure. Maybe that's ok, but if so I'm  
 wondering
 why, since this skips all the mapping business driInitExtensions
did and
 just sets the extension enable bits to true. At least I'm fairly
sure it
 was needed in the past...

 Roland
 I believe airlied pointed out earlier that


 http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72
fixed that problem.

But even with that commit, all drivers still call  
 driInitExtensions at
least once, though the parameter list can be NULL. I don't see  
 that
happening here.

Roland



 




Re: [Mesa3d-dev] RFC: gallium-format-cleanup branch (was Gallium format swizzles)

2010-03-03 Thread Roland Scheidegger
On 03.03.2010 14:07, José Fonseca wrote:
 On Wed, 2010-03-03 at 04:27 -0800, Luca Barbieri wrote:
 PIPE_FORMAT_X8B8G8R8_UNORM is being used by mesa.
 PIPE_FORMAT_R8G8B8X8_UNORM doesn't exist hence it appears to be
 unnecessary. So it doesn't make sense to rename.
 How about D3DFMT_X8B8G8R8? That should map to
 PIPE_FORMAT_R8G8B8X8_UNORM.
 
 Yes, you're right.
 
 BTW, we are also missing D3DFMT_X4R4G4B4, D3DFMT_X1R5G5B5, 
 D3DFMT_A4L4, D3DFMT_A1, D3DFMT_L6V5U5, D3DFMT_D15S1,
 D3DFMT_D24X4S4, D3DFMT_CxV8U8 and perhaps others I did not notice.
 
 D3DFMT_L6V5U5 is there (PIPE_FORMAT_R5SG5SB6U_NORM). The others are 
 indeed missing. Neither of the mentioned formats is required for D3D9
  conformance, but we could add them to gallium.
 
 D3DFMT_A1 is special: it has less than 1 byte per pixel. Probably the
  best way to support it would be to treat it as a 8x1 macro pixel,
 8bits, similarly to compressed formats.
 
 D3DFMT_CxV8U8 too has special semantics.

And not only are those formats optional, some would be completely
pointless in gallium (D15S1, D24X4S4). There's simply no modern hardware
which supports 1-bit stencil (I think pretty much the only chip
supporting that was the savage3d), nor 4-bit stencil (can't remember
off-hand any chip supporting that; maybe some of the then-professional
chips did). The others sound a bit more plausible and hardware may
support them, but I'm not sure they are really missed (A4L4, X4R4G4B4,
X1R5G5B5). As José said, CxV8U8 isn't really just a format, and we'll
need to add a 1-bit format for DX10 anyway.

Roland



Re: [Mesa3d-dev] Does DX9 SM3 - VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Roland Scheidegger
On 03.03.2010 20:23, Luca Barbieri wrote:
 And never will...  It does not export PIPE_CAP_GLSL, and does not have
 the shader opcodes to ever do so.
 
 Any Gallium driver should be able to support the GLSL subset without
 control flow.
 
 And if we had a proper optimization infrastructure capable of inlining
 functions, converting conditionals to multiplications and unrolling
 loops (e.g. look at what the nVidia Cg compiler does), then
 essentially all GLSL could be supported on any driver, with only
 limitations on the maximum number of loop iterations.
 
 Isn't it worth supporting that?
 
 BTW, proprietary drivers do this: for instance nVidia supports GLSL on
 nv30, which can't do control flow in fragment shaders and doesn't
 support SM3.

I think the i915 is a lot closer to r300 in that regard (which is quite
a bit more limited than nv30), and it's true that ATI also supported
glsl on that. As far as I know though it was quite easy to bump into
shaders which wouldn't compile. There's only so much you can do if you
have 4 blocks of (max) 16 instructions to run without any control flow
if you need to unroll loops, not to mention lacking instructions for
derivatives, or the fact things like sin/cos will take quite a few
instructions...
nv30, while processing fragment shaders slowly, had a LOT higher
instruction count, IIRC supported derivatives and predication and had no
dependent texturing limit. So that makes it a lot better suited for glsl
hacks.
So, I'm not sure it really makes a whole lot of sense to support glsl on
i915. It'll really only ever work for very simple things (granted there
are apps out there which indeed will only use glsl shaders which are
known to compile fine on r300...)

Roland



Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge

2010-03-02 Thread Roland Scheidegger
On 02.03.2010 11:37, Keith Whitwell wrote:
 On Mon, 2010-03-01 at 10:02 -0800, Roland Scheidegger wrote:
 Hi,

 this branch turns vertex element into a cso, so instead of
 set_vertex_elements there's now the triad of
 create/bind/delete_vertex_elements_state. I have converted all the
 drivers except nouveau (I didn't do it because Christoph Bumiller
 already did nv50, but I can give the rest of them a shot), though that
 doesn't necessarily mean they are optimized for it (the idea is of
 course to precalculate state on create, not just copy the pipe structs
 and do everything on bind) - only i965g really does something close to
 it (though still emits the state always). Drivers doing both hw vertex
 shaders and using draw in some circumstances of course will have to
 store both representations on create.
 Also note that util_draw_vertex_buffer semantics have changed a bit
 (caller needs to set vertex element state, which is a bit odd).
 
 Roland,
 
 The branch looks good to me, happy to see it merged when you're ready to
 go.

There's actually something in the cso code I was a bit unsure about,
I've looked at it again and indeed it seems wrong. The problem is that
the count value itself isn't stored for the comparison. So in the
unlikely case the hash value is the same for pipe_vertex_elements with
different counts, the comparison itself will also be the same as long as
the first few elements are identical. Which seems very wrong. The
easiest way to fix would probably be to just store the count alongside
the pipe_vertex_element data, but that would need an additional copy of
the incoming data in cso_set_vertex_elements. Hmm...

Roland



[Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge

2010-03-01 Thread Roland Scheidegger
Hi,

this branch turns vertex element into a cso, so instead of
set_vertex_elements there's now the triad of
create/bind/delete_vertex_elements_state. I have converted all the
drivers except nouveau (I didn't do it because Christoph Bumiller
already did nv50, but I can give the rest of them a shot), though that
doesn't necessarily mean they are optimized for it (the idea is of
course to precalculate state on create, not just copy the pipe structs
and do everything on bind) - only i965g really does something close to
it (though still emits the state always). Drivers doing both hw vertex
shaders and using draw in some circumstances of course will have to
store both representations on create.
Also note that util_draw_vertex_buffer semantics have changed a bit
(caller needs to set vertex element state, which is a bit odd).

Roland



Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge

2010-03-01 Thread Roland Scheidegger
On 01.03.2010 19:02, Roland Scheidegger wrote:
 Hi,
 
 this branch turns vertex element into a cso, so instead of
 set_vertex_elements there's now the triad of
 create/bind/delete_vertex_elements_state. I have converted all the
 drivers except nouveau (I didn't do it because Christoph Bumiller
 already did nv50, but I can give the rest of them a shot), though that
 doesn't necessarily mean they are optimized for it (the idea is of
 course to precalculate state on create, not just copy the pipe structs
 and do everything on bind) - only i965g really does something close to
 it (though still emits the state always). Drivers doing both hw vertex
 shaders and using draw in some circumstances of course will have to
 store both representations on create.
 Also note that util_draw_vertex_buffer semantics have changed a bit
 (caller needs to set vertex element state, which is a bit odd).

Ok, I've converted nv30/nv40 too. Not that they'd precalculate any hw
state...

Roland



Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge

2010-03-01 Thread Roland Scheidegger
On 02.03.2010 00:18, Joakim Sindholt wrote:
 On Mon, 2010-03-01 at 19:02 +0100, Roland Scheidegger wrote:
 Hi,

 this branch turns vertex element into a cso, so instead of
 set_vertex_elements there's now the triad of
 create/bind/delete_vertex_elements_state. I have converted all the
 drivers except nouveau (I didn't do it because Christoph Bumiller
 already did nv50, but I can give the rest of them a shot), though that
 doesn't necessarily mean they are optimized for it (the idea is of
 course to precalculate state on create, not just copy the pipe structs
 and do everything on bind) - only i965g really does something close to
 it (though still emits the state always). Drivers doing both hw vertex
 shaders and using draw in some circumstances of course will have to
 store both representations on create.
 Also note that util_draw_vertex_buffer semantics have changed a bit
 (caller needs to set vertex element state, which is a bit odd).

 Roland
 
 Can I still do things like:
 element 0: -> vbo 5
 element 1: -> vbo 2
 and then set_vertex_buffers() with an array { zeros, zeros, vbo 2, zeros,
 zeros, vbo 5 } ?
 

The branch doesn't change pipe_vertex_element itself (except
nr_components got removed as that's really derived from the associated
pipe_format), only how those vertex elements are set. Hence you can do
exactly the same things you could do before. Though I'm not quite sure
what your zeros mean; if that's just an unused vbo it should be ok, but
it is probably not ok to just pass in a null pointer for an unused
pipe_vertex_buffer.
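The pattern in question, sketched with simplified stand-ins for the gallium structures: each element indexes into the bound vertex buffer array, so unused slots just need to be valid (e.g. zeroed) entries rather than null pointers.

```c
#include <assert.h>

/* Simplified stand-ins, for illustration only. */
struct vbuf_sketch  { unsigned stride; const void *buffer; };
struct velem_sketch { unsigned src_offset; unsigned vertex_buffer_index; };

/* Compute the base pointer an element would fetch from, given the array
 * of bound vertex buffers. Slots the elements never reference are simply
 * never dereferenced. */
static const void *element_base(const struct velem_sketch *e,
                                const struct vbuf_sketch *bufs)
{
   const struct vbuf_sketch *vb = &bufs[e->vertex_buffer_index];
   return (const char *)vb->buffer + e->src_offset;
}
```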

Roland



Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view

2010-02-25 Thread Roland Scheidegger
On 25.02.2010 18:39, michal wrote:
 Roland Scheidegger wrote on 2010-02-24 15:18:
 On 24.02.2010 12:48, Christoph Bumiller wrote:
   
 This wasn't a problem before because textures and samplers were
 linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch,
 this coordinate normalization bit becomes a problem.

 NV50 hardware has that bit in the RESOURCE binding, and not the
 SAMPLER binding, and you can imagine that this will lead to us having
 to jump through a few annoying looking hoops to accommodate.

 As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have
 sampler states that are decoupled from the texture, and which contain
 a normalized coordinates bit, so it's worth considering not having it there
 in gallium either.

 For OpenGL, unnormalized coordinates are only used for RECT textures,
 and in this case it makes sense to make it a property of the texture.
 
 I agree this is not sampler state, but I don't quite agree this should
 be texture state.
 This changes how texture coordinates get interpreted in the interpolator
 - in that sense it is similar to the cylindrical texture coord wrap
 which we moved away from sampler state recently. This one got moved to
 shader declaration. I wonder if the normalization bit should be treated
 the same.
 Though OTOH you're quite right that in OpenGL this really is texture
 property (it is a different texture target after all), and afaik d3d
 doesn't support non-normalized coords (?). Hmm...

   
 Isn't it the case that for RECT targets we clear the bit, and for others 
 we always set it?
 
 In mesa st I see:
 
   if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
      sampler->normalized_coords = 1;
 
 By definition, RECT texture with normalised coordinates is just an NPOT. 
 If we removed this apparently redundant flag, would that make nouveau 
 developers life easier?
But we don't have rect targets in gallium, hence we need the flag. I
think conceptually this makes sense, since for texture layouts etc.
drivers won't care one bit whether this is a 2d npot or a rect texture.
Though I guess introducing rect targets instead would be another option.
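For illustration, the difference the normalized_coords bit makes to a software sampler might look roughly like this (a sketch assuming nearest filtering and clamp-to-edge, not the actual softpipe code):

```c
#include <assert.h>

/* Map a texture coordinate to a texel column. With normalized coords the
 * input is in [0, 1] and gets scaled by the texture width; with
 * unnormalized (RECT-style) coords it is already in texels. */
static int sample_x(float s, int width, int normalized)
{
   float tx = normalized ? s * (float)width : s;
   int x = (int)tx;             /* nearest, truncating */
   if (x < 0)
      x = 0;                    /* clamp-to-edge */
   if (x >= width)
      x = width - 1;
   return x;
}
```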

Roland


 
   
 And, finally, I've seen you reverted the changes for independent image
 and sampler index in the texture opcodes. What's up with that ?
 Is the code not nice enough, or has the idea been discarded and my problem
 disappears?

 
 
 Please consider this branch dead. It will be easier for me to introduce 
 new, optional sampler and fetch opcodes à la GL 3.0. There's just too 
 much code to fix and test and we still want the older hardware not to 
 stand on its head to try and translate back to old model.
 
 Thanks.




Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view

2010-02-24 Thread Roland Scheidegger
On 24.02.2010 12:48, Christoph Bumiller wrote:
 This wasn't a problem before because textures and samplers were
 linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch,
 this coordinate normalization bit becomes a problem.
 
 NV50 hardware has that bit in the RESOURCE binding, and not the
 SAMPLER binding, and you can imagine that this will lead to us having
 to jump through a few annoying looking hoops to accommodate.
 
 As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have
 sampler states that are decoupled from the texture, and which contain
 a normalized coordinates bit, so it's worth considering not having it there
 in gallium either.
 
 For OpenGL, unnormalized coordinates are only used for RECT textures,
 and in this case it makes sense to make it a property of the texture.

I agree this is not sampler state, but I don't quite agree this should
be texture state.
This changes how texture coordinates get interpreted in the interpolator
- in that sense it is similar to the cylindrical texture coord wrap
which we moved away from sampler state recently. This one got moved to
shader declaration. I wonder if the normalization bit should be treated
the same.
Though OTOH you're quite right that in OpenGL this really is texture
property (it is a different texture target after all), and afaik d3d
doesn't support non-normalized coords (?). Hmm...

Roland



 
 And, finally, I've seen you reverted the changes for independent image
 and sampler index in the texture opcodes. What's up with that ?
 Is the code not nice enough, or has the idea been discarded and my problem
 disappears?
 
 Best regards,
 Christoph
 
 




Re: [Mesa3d-dev] [PATCH] st/dri: don't enable EXT_draw_buffers2 by default

2010-02-22 Thread Roland Scheidegger
Marek,

I don't particularly like that patch, because it doesn't really fix the
problem with the extension handling.
There are lots of extensions listed there which should not be advertised
by default, so picking one out won't fix the others.
I think they are there because driInitExtensions definitely does more
than just set ctx->Extensions_foo_bar to enabled.
Other extensions in this list which are queried by CAP bits but still
show up in the extension string regardless are the glsl ones
(ARB_fragment_shader and friends), a couple texture address modes
(mirrored_repeat, mirror_clamp), blend_equation_separate, technically
even ARB_multitexture (though we probably should skip the test for more
than 1 texture unit and always set that to true in st_extensions.c),
two-sided stencil, occlusion queries, anisotropic filtering, ycbcr
textures, packed depth stencil (there may be more that was just from a
quick look).
So if it's ok to remove them all from that list this should be done, but
I fear it's not ok and the fix needs to be a bit more complicated (see
comments in dri_init_extensions).

Roland







On 21.02.2010 16:00, Marek Olšák wrote:
  Hi,
 
 the attached patch modifies st/dri to not enable EXT_draw_buffers2 by
 default because r300g and most probably even some other drivers can't
 support this extension. The drivers reporting support of
 PIPE_CAP_INDEP_BLEND_ENABLE are not affected by this patch.
 
 Please review.
 
 Marek
 
 
 
 
 From ddda2c19b74780263f848ffafe10809bd6385d01 Mon Sep 17 00:00:00 2001
 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= mar...@gmail.com
 Date: Sun, 21 Feb 2010 01:27:09 +0100
 Subject: [PATCH 2/2] st/dri: don't enable EXT_draw_buffers2 by default
 
 ---
  src/gallium/state_trackers/dri/dri_extensions.c |1 -
  1 files changed, 0 insertions(+), 1 deletions(-)
 
 diff --git a/src/gallium/state_trackers/dri/dri_extensions.c 
 b/src/gallium/state_trackers/dri/dri_extensions.c
 index 1259813..7f8ceef 100644
 --- a/src/gallium/state_trackers/dri/dri_extensions.c
 +++ b/src/gallium/state_trackers/dri/dri_extensions.c
 @@ -99,7 +99,6 @@ static const struct dri_extension card_extensions[] = {
 {GL_EXT_blend_minmax, GL_EXT_blend_minmax_functions},
 {GL_EXT_blend_subtract, NULL},
 {GL_EXT_cull_vertex, GL_EXT_cull_vertex_functions},
 -   {GL_EXT_draw_buffers2, GL_EXT_draw_buffers2_functions},
 {GL_EXT_fog_coord, GL_EXT_fog_coord_functions},
 {GL_EXT_framebuffer_object, GL_EXT_framebuffer_object_functions},
 {GL_EXT_multi_draw_arrays, GL_EXT_multi_draw_arrays_functions},
 
 
 
 




Re: [Mesa3d-dev] Mesa (master): r300g: remove L8_UNORM from colorbuffer formats

2010-02-19 Thread Roland Scheidegger
This isn't actually true any more. See issue (9) of
ARB_framebuffer_object which defines luminance, luminance_alpha and
intensity as renderable.
(I'm not quite sure how color assignment is done, readpixels and the
like would define L = R + G + B, but I think it will follow the table
from texture image specification instead, hence L = R, I = R).
You are quite right though that this is a recent addition, and in fact for
instance i965 can't render to these directly either (it can render to red or
alpha formats, but none of the l/i formats), and neither can r300
(without shader hacking).
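The two color assignments mentioned above, side by side (just restating the discussion as code):

```c
#include <assert.h>

/* ReadPixels-style conversion would define luminance as the sum of the
 * color channels, while the texture image specification table takes only
 * the red channel (and likewise I = R for intensity). */
static float lum_readpixels(float r, float g, float b)
{
   return r + g + b;   /* L = R + G + B */
}

static float lum_teximage(float r, float g, float b)
{
   (void)g;
   (void)b;
   return r;           /* L = R (and I = R) */
}
```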

Roland


On 19.02.2010 15:35, Marek Olšák wrote:
 I still think st/xorg should use R8, which is well defined as to which
 component to store, rather than L8. That's also the reason L8 is not
 renderable in OpenGL.
 
 
 2010/2/19 Corbin Simpson mostawesomed...@gmail.com
 
 Yeah, I would have nak'd this. Will revert when I get home.
 
 Posting from a mobile, pardon my terseness. ~ C.
 
 On Feb 19, 2010 12:56 AM, Michel Dänzer mic...@daenzer.net wrote:

 On Thu, 2010-02-18 at 19:24 -0800, Marek Olšák wrote:
  Module: Mesa
  Branch: master
  Commit: fc427d23439a2702068209957f08990ea29fe21b
  URL:  
  
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=fc427d23439a2702068209957f08990ea29fe21b
 
  Author: Marek Olšák mar...@gmail.com
  Date:   Fri Feb 19 04:23:06 2010 +0100
 
  r300g: remove L8_UNORM from colorbuffer formats
 
  Not renderable in OpenGL anyway.

 The Xorg state tracker uses it though.


 --
 Earthling Michel Dänzer   |  
  http://www.vmware.com
 Libre software enthusiast |  Debian, X and DRI
 developer

 
 
 
 
 
 




Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .

2010-02-12 Thread Roland Scheidegger
On 12.02.2010 14:44, michal wrote:
 Keith Whitwell wrote on 2010-02-12 14:28:
 On Fri, 2010-02-12 at 05:09 -0800, michal wrote:
   
 Keith Whitwell wrote on 2010-02-12 13:39:
 
 On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:
   
   
 Module: Mesa
 Branch: master
 Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f

 Author: Michal Krol mic...@vmware.com
 Date:   Fri Feb 12 13:32:35 2010 +0100

 util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.
 
 
 Michal,

 Is this more like two different users expecting two different results in
 those unused columns?

 In particular, we definitely require the missing elements to be extended
 to (0,0,0,1) when fetching vertex data, and probably also in OpenGL
 texture sampling (if we supported these formats for that).  

   
   
 Gallium should follow D3D rules, so I've been following D3D here. Also, 
 util_unpack_color_ub() in u_pack_color.h already sets the remaining 
 fields to 0xff.

 Note that D3D doesn't have the problem with expanding vertex attribute 
 data since you can't have X or XY vertex positions, only XYZ (with W 
 extended to 1 as in GL) and XYZW.
 
 But surely D3D permits two-component texture coordinates, which would be
 PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)...

   
 Brian added a table of differences between GL and other APIs recently to
 gallium/docs - does your change agree with that?

   
   
 Where's that exactly, I can't find it?
 
 It seems like we'd want to be able to support both usages - the
 alternative in texture sampling would be forcing the state tracker to
 generate variants of the shader when 2-component textures are bound.  I
 would say that's an unreasonable requirement on the state tracker.

 It seems like in GL would want (0,0,0,1) expansion everywhere, but D3D
 would want differing expansions in different parts of the pipeline.
 That indicates a single flag in the context somewhere isn't sufficient
 to choose between the two.
  
 Maybe there need to be two versions of these PIPE_FORMAT_ enums to
 capture the different values in the missing components?

 EG:

PIPE_FORMAT_R32G32_0001_FLOAT
   PIPE_FORMAT_R32G32_1111_FLOAT

 ? or something along those lines??

   
 
 You are right.
 
 Alternatively, follow the more sane API (GL apparently), assume 0001 as 
 default and use the 1111 infix to override.

Note it's not just GL. D3D10 uses the same expansion. Only D3D9 is
different. Well for texture sampling anyway, I don't know what d3d does
for vertex formats.

Though for most hardware it would make sense to have only one format per
different expansion, and use some swizzling parameter for sampling,
because that's actually how the hardware works. But not all drivers will
be able to do this, unfortunately.
(Note that for instance, with i965, those two R32G32 formats mentioned
here aren't really freely selectable. In OGL/DX10 mode you'll get the
former, in d3d9 mode you get the latter. You can switch the mode but
you'll also get different border color interpretation along with it -
something which is also not specified in gallium, though I guess you
could say this is tied to gl_rasterization_rules - maybe we could say
the same too, R32G32 is rg01 with gl_rasterization_rules and rg11
without? Seems a bit hackish, though.).
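The two expansion conventions under discussion for PIPE_FORMAT_R32G32_FLOAT, as a sketch (rg01 for GL/D3D10-style rules, rg11 for D3D9-style):

```c
#include <assert.h>

/* Expand a two-component fetch to four components. Which constant lands
 * in the missing Z slot is exactly the rg01-vs-rg11 question above. */
static void expand_rg(const float src[2], float dst[4], int d3d9_rules)
{
   dst[0] = src[0];
   dst[1] = src[1];
   dst[2] = d3d9_rules ? 1.0f : 0.0f;   /* D3D9: 1, GL/D3D10: 0 */
   dst[3] = 1.0f;                       /* W expands to 1 either way */
}
```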

Roland

--
SOLARIS 10 is the OS for Data Centers - provides features such as DTrace,
Predictive Self Healing and Award Winning ZFS. Get Solaris 10 NOW
http://p.sf.net/sfu/solaris-dev2dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .

2010-02-12 Thread Roland Scheidegger
On 12.02.2010 18:42, Keith Whitwell wrote:
 On Fri, 2010-02-12 at 09:28 -0800, José Fonseca wrote:
 On Fri, 2010-02-12 at 06:43 -0800, Roland Scheidegger wrote:
 On 12.02.2010 14:44, michal wrote:
 Keith Whitwell wrote on 2010-02-12 14:28:
 On Fri, 2010-02-12 at 05:09 -0800, michal wrote:
   
 Keith Whitwell wrote on 2010-02-12 13:39:
 
 On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:
   
   
 Module: Mesa
 Branch: master
 Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f

 Author: Michal Krol mic...@vmware.com
 Date:   Fri Feb 12 13:32:35 2010 +0100

 util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.
 
 
 Michal,

 Is this more like two different users expecting two different results in
 those unused columns?

 In particular, we definitely require the missing elements to be extended
 to (0,0,0,1) when fetching vertex data, and probably also in OpenGL
 texture sampling (if we supported these formats for that).  

   
   
 Gallium should follow D3D rules, so I've been following D3D here. Also, 
 util_unpack_color_ub() in u_pack_color.h already sets the remaining 
 fields to 0xff.

 Note that D3D doesn't have the problem with expanding vertex attribute 
 data since you can't have X or XY vertex positions, only XYZ (with W 
 extended to 1 as in GL) and XYZW.
 
 But surely D3D permits two-component texture coordinates, which would be
 PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)...

   
 Brian added a table of differences between GL and other APIs recently to
 gallium/docs - does your change agree with that?

   
   
 Where's that exactly, I can't find it?
 
 It seems like we'd want to be able to support both usages - the
 alternative in texture sampling would be forcing the state tracker to
 generate variants of the shader when 2-component textures are bound.  I
 would say that's an unreasonable requirement on the state tracker.

 It seems like in GL would want (0,0,0,1) expansion everywhere, but D3D
 would want differing expansions in different parts of the pipeline.
 That indicates a single flag in the context somewhere isn't sufficient
 to choose between the two.
  
 Maybe there need to be two versions of these PIPE_FORMAT_ enums to
 capture the different values in the missing components?

 EG:

PIPE_FORMAT_R32G32_0001_FLOAT
   PIPE_FORMAT_R32G32_1111_FLOAT

 ? or something along those lines??

   
 You are right.

 Alternatively, follow the more sane API (GL apparently), assume 0001 as 
 default and use the 1111 infix to override.
 Note it's not just GL. D3D10 uses same expansion. Only D3D9 is
 different. Well for texture sampling anyway, I don't know what d3d does
 for vertex formats.

 Though for most hardware it would make sense to have only one format per
 different expansion, and use some swizzling parameter for sampling,
 because that's actually how the hardware works. But not all drivers will
 be able to do this, unfortunately.
 You mean, having a swizzle in pipe_sampler_state ?

 It sounds a good idea.

 In the worst case some component will inevitably need to make shader
 variants with different swizzles. In this case it probably makes sense
 to be the pipe driver -- it's a tiny shader variation which could be
 done without recompiling the whole shader, but if the state tracker does
 it then the pipe driver will always have to recompile.

 In the best case it is handled by the hardware's texture sampling unit.

 It's in theory similar to baking the swizzle in the format as Keith
 suggested, but cleaner IMHO. The question is whether it makes sense to
 have full xwyz01 swizzles, or just 01 swizzles.
 
 Another alternative is to just add the behaviour we really need - a
 single flag at context creation time that says what the behaviour of the
 sampler should be for these textures.
 
 Then the driver wouldn't have to worry about varients or mixing two
 different expansions.  Hardware (i965 at least) seems to have one global
 mode to switch between these, and that's all we need to choose the right
 behaviour for each state tracker.
 
 It might be simpler all round just to specify it at context creation.

Yes, for rg01 vs rg11 this is easiest. It doesn't solve the depth
texture mode problem though.
Also, we sort of have that flag already, I think there's no reason why
this needs to be separate from gl_rasterization_rules (though I guess in
that case it's a bit of a misnomer...)

Roland



Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .

2010-02-12 Thread Roland Scheidegger
On 12.02.2010 19:00, Keith Whitwell wrote:
 On Fri, 2010-02-12 at 09:56 -0800, Roland Scheidegger wrote:
 On 12.02.2010 18:42, Keith Whitwell wrote:
 On Fri, 2010-02-12 at 09:28 -0800, José Fonseca wrote:
 On Fri, 2010-02-12 at 06:43 -0800, Roland Scheidegger wrote:
 On 12.02.2010 14:44, michal wrote:
 Keith Whitwell wrote on 2010-02-12 14:28:
 On Fri, 2010-02-12 at 05:09 -0800, michal wrote:
   
 Keith Whitwell wrote on 2010-02-12 13:39:
 
 On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:
   
   
 Module: Mesa
 Branch: master
 Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f

 Author: Michal Krol mic...@vmware.com
 Date:   Fri Feb 12 13:32:35 2010 +0100

 util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.
 
 
 Michal,

 Is this more like two different users expecting two different results 
 in
 those unused columns?

 In particular, we definitely require the missing elements to be 
 extended
 to (0,0,0,1) when fetching vertex data, and probably also in OpenGL
 texture sampling (if we supported these formats for that).  

   
   
 Gallium should follow D3D rules, so I've been following D3D here. 
 Also, 
 util_unpack_color_ub() in u_pack_color.h already sets the remaining 
 fields to 0xff.

 Note that D3D doesn't have the problem with expanding vertex attribute 
 data since you can't have X or XY vertex positions, only XYZ (with W 
 extended to 1 as in GL) and XYZW.
 
 But surely D3D permits two-component texture coordinates, which would be
 PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)...

   
 Brian added a table of differences between GL and other APIs recently 
 to
 gallium/docs - does your change agree with that?

   
   
 Where's that exactly, I can't find it?
 
 It seems like we'd want to be able to support both usages - the
 alternative in texture sampling would be forcing the state tracker to
 generate variants of the shader when 2-component textures are bound.  I
 would say that's an unreasonable requirement on the state tracker.

 It seems like in GL would want (0,0,0,1) expansion everywhere, but D3D
 would want differing expansions in different parts of the pipeline.
 That indicates a single flag in the context somewhere isn't sufficient
 to choose between the two.
  
 Maybe there need to be two versions of these PIPE_FORMAT_ enums to
 capture the different values in the missing components?

 EG:

PIPE_FORMAT_R32G32_0001_FLOAT
   PIPE_FORMAT_R32G32_1111_FLOAT

 ? or something along those lines??

   
 You are right.

 Alternatively, follow the more sane API (GL apparently), assume 0001 as 
 default and use the 1111 infix to override.
 Note it's not just GL. D3D10 uses same expansion. Only D3D9 is
 different. Well for texture sampling anyway, I don't know what d3d does
 for vertex formats.

 Though for most hardware it would make sense to have only one format per
 different expansion, and use some swizzling parameter for sampling,
 because that's actually how the hardware works. But not all drivers will
 be able to do this, unfortunately.
 You mean, having a swizzle in pipe_sampler_state ?

 It sounds a good idea.

 In the worst case some component will inevitably need to make shader
 variants with different swizzles. In this case it probably makes sense
 to be the pipe driver -- it's a tiny shader variation which could be
 done without recompiling the whole shader, but if the state tracker does
 it then the pipe driver will always have to recompile.

 In the best case it is handled by the hardware's texture sampling unit.

 It's in theory similar to baking the swizzle in the format as Keith
 suggested, but cleaner IMHO. The question is whether it makes sense to
 have full xwyz01 swizzles, or just 01 swizzles.
 Another alternative is to just add the behaviour we really need - a
 single flag at context creation time that says what the behaviour of the
 sampler should be for these textures.

 Then the driver wouldn't have to worry about varients or mixing two
 different expansions.  Hardware (i965 at least) seems to have one global
 mode to switch between these, and that's all we need to choose the right
 behaviour for each state tracker.

 It might be simpler all round just to specify it at context creation.
 Yes, for rg01 vs rg11 this is easiest. It doesn't solve the depth
 texture mode problem though.
 Also, we sort of have that flag already, I think there's no reason why
 this needs to be separate from gl_rasterization_rules (though I guess in
 that case it's a bit of a misnomer...)
 
 I'd prefer to avoid a big I'm a GL/DX9 context flag, and split
 different behaviours into different flags.  Sure, a GL state tracker
 might set them all one way, but that doesn't mean some future
 state-tracker wouldn't want to use a novel combination.
 
 The GL rasterization rules flag should be renamed to reflect what it's
 really asking for.
 
Ok

[Mesa3d-dev] nouveau changes for gallium-dynamicstencilref

2010-02-11 Thread Roland Scheidegger
Hi,

could one of the nouveau developers please take a look at the nv30
changes I did for the stencil ref changes in gallium-dynamicstencilref
branch?
I've just done that in a way I think it might make sense, but I've
absolutely no idea if it would work like that (and even if it would in
theory there might of course still be bugs in it...)
Also, I was a bit confused about the so_new() parameters as the numbers
didn't seem to add up (assuming it's basically the max number of
so_method and so_data calls).
Anyway, if it makes sense I can do nv40/nv50 too, if not tell me what
needs to be done instead or do it yourself :-). Or it will break after
the merge...

Roland



Re: [Mesa3d-dev] nouveau changes for gallium-dynamicstencilref

2010-02-11 Thread Roland Scheidegger
On 11.02.2010 21:42, Christoph Bumiller wrote:
 On 02/11/2010 09:02 PM, Roland Scheidegger wrote:
 Hi,

 could one of the nouveau developers please take a look at the nv30
 changes I did for the stencil ref changes in gallium-dynamicstencilref
 branch?
 I've just done that in a way I think it might make sense, but I've
 absolutely no idea if it would work like that (and even if it would in
 theory there might of course still be bugs in it...)
 Looks like it should work, I can't test nv30 myself though.
 
 Also, I was a bit confused about the so_new() parameters as the numbers
 didn't seem to add up (assuming it's basically the max number of
 so_method and so_data calls).
 It's (nr of so_method, nr of so_data + nr of so_reloc, nr of so_reloc),
 since relocs/addresses are considered data.
Ok that's what I figured. The numbers were just wrong, then (nv30 used
5/21/0 but the actual max was 4/22/0, nv40 uses 4/21/0 and it should
also be 4/22/0). Hence the confusion. At least things looked ok for nv50
(though it seems to needlessly split up the back face state into two
so_methods - anyway that'll change).
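Christoph's counting rule can be made concrete with a small sketch; the names below are illustrative, not the actual nouveau API, they just encode the (nr_method, nr_data + nr_reloc, nr_reloc) arithmetic:

```c
#include <assert.h>

/* Hypothetical tally of what a state-object emit consumes, following the
 * rule above: so_new(nr_method, nr_data + nr_reloc, nr_reloc), since
 * relocs/addresses count as data too. */
struct so_tally {
   unsigned methods;   /* number of so_method() calls */
   unsigned data;      /* number of so_data() calls */
   unsigned relocs;    /* number of so_reloc() calls */
};

static unsigned so_param_methods(const struct so_tally *t)
{
   return t->methods;
}

static unsigned so_param_data(const struct so_tally *t)
{
   return t->data + t->relocs;   /* relocs/addresses count as data */
}

static unsigned so_param_relocs(const struct so_tally *t)
{
   return t->relocs;
}
```

With the corrected nv30 numbers from above (4 methods, 22 data, 0 relocs), this yields the 4/22/0 parameters.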

 
 Anyway, if it makes sense I can do nv40/nv50 too, if not tell me what
 needs to be done instead or do it yourself :-). Or it will break after
 the merge...

 I think you'll get it right. nv40 should be about the same as nv30,
 nv50 state is a bit less elegant but should be easy to adjust, too.

Ok.

Roland




Re: [Mesa3d-dev] fix the usual cell breakage

2010-02-08 Thread Roland Scheidegger
On 06.02.2010 15:07, Marc Dietrich wrote:
 also update the cell config a bit
 ---
  configs/linux-cell |6 ++--
  src/gallium/drivers/cell/common.h  |3 +-
  src/gallium/drivers/cell/spu/spu_per_fragment_op.c |   36 
 ++--
  3 files changed, 22 insertions(+), 23 deletions(-)

Sorry about that. I got confused there and thought the driver was using
cell_blend_state rather than pipe_blend_state...
cell_blend_state actually seems to be just a leftover from the past,
though...

Roland

--
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] [RFC]: gallium-nopointsizeminmax merge

2010-02-08 Thread Roland Scheidegger
On 08.02.2010 18:27, Brian Paul wrote:
 On Mon, Feb 8, 2010 at 10:21 AM, Roland Scheidegger srol...@vmware.com 
 wrote:
 This branch removes point_size_min and point_size_max because most
 hardware doesn't have any register to clamp this at rasterization time
 (from all gallium drivers, only r300 had this), and the mesa state
 tracker actually never used these field properly. The clamp to
 implementation limits will now be done in the vertex shader instead.
 Also, point_sprite enable is removed and replaced with a
 point_quad_rasterization field. The reason for this is that OGL actually
 has quite different rasterization rules for points and point sprites -
 hence this indicates if points should be rasterized as points or
 according to point sprite (which decomposes them into quads, basically)
 rules. It is unclear to me if we'd actually really need to do something
 different for these rules in the draw module or if hardware can do much
 with this information, but if there's hardware which can well you can
 use it.
 The point sprite coord enable is no longer also indicating the sprite
 coord origin, since there's no api interested in this per coord.

 Testing was done with softpipe; pointblast doesn't work (it does not
 draw any points at all), and neither does spriteblast work correctly
 (some points have their size cut vertically, so they end up as
 rectangles). However, these bugs are not introduced by this branch; they
 must be pre-existing bugs in the draw module. I'm still trying to figure
 out what goes wrong.
 
 They're OK on master.  I fixed some breakage in this area last week.
 See 54d7ec8e769b588ec93dea5bc04399e91737557e for example.

Yes, you're right; I missed that (probably because the fixes weren't in the
draw module but in the actual drivers). After cherry-picking those commits
to the branch, pointblast works again, and spriteblast actually wasn't
broken (parts of points simply disappeared due to the depth test...).
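The point size clamp that the branch moves into the vertex shader amounts to simple saturation against the implementation limits; a minimal sketch, with the limit values made up for illustration:

```c
#include <assert.h>

/* Minimal sketch of the clamp now done in the vertex shader: the computed
 * point size is saturated against implementation limits. The limits used
 * in the test are hypothetical, not any particular hardware's values. */
static float clamp_point_size(float size, float min_size, float max_size)
{
   return size < min_size ? min_size : (size > max_size ? max_size : size);
}
```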

Roland




Re: [Mesa3d-dev] Gallium DRI fbconfig/visual setup

2010-02-05 Thread Roland Scheidegger
On 05.02.2010 22:48, Corbin Simpson wrote:
 Two things...
 
 Are accumbufs still slow in Gallium-land? Should we still mark them as slow?
 
 How many multisamples should we actually pretend/advertise? Should we
 have a cap to check the number of multisamples supported? Should we
 just say that four samples are done for the fbconfig/visual, and then
 replace pipe_texture::nr_samples with a multisample boolean flag?

I think it would be nice if we could support multiple MSAA levels. Sure
hardware typically can do 4xMSAA, but maybe you'd really want max
quality (with modern hw often offering 8x, and that's not taking
specialties like csaa into account) or only 2x.
Maybe the cap should return a bitmask indicating which levels are supported.
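The proposed cap could be encoded roughly like this; the names and the bit layout are assumptions, not an actual Gallium interface:

```c
#include <assert.h>

/* Hypothetical encoding of the proposed cap: bit (n - 1) set means
 * n-sample surfaces are supported. */
static unsigned msaa_cap_from_levels(const unsigned *levels, int count)
{
   unsigned mask = 0;
   int i;
   for (i = 0; i < count; i++)
      mask |= 1u << (levels[i] - 1);
   return mask;
}

static int msaa_level_supported(unsigned mask, unsigned samples)
{
   return (mask >> (samples - 1)) & 1;
}
```

A driver supporting 1x/2x/4x/8x would then report a single mask, and the state tracker can query arbitrary levels instead of assuming 4x.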

Roland



Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch

2010-02-03 Thread Roland Scheidegger
On 03.02.2010 16:07, michal wrote:
 Keith,
 
 This feature branch adds cylindrical wrap texcoord mode to gallium 
 shader tokens and removes prefilter field from sampler state. 
 Implemented cylindrical wrapping for linear interpolator in softpipe. 
 Not sure whether it makes sense to do it for perspective interpolator. 
 Documented TGSI declaration token.
 
 Sample fragment shader declaration that wraps S and T coordinates follows.
 
 DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY
 
 Please review so I can merge it to master.

Michal,

why do you need this for the linear interpolator but not the perspective
one? I think d3d mobile lets you disable perspective-correct texturing, but
it is always enabled for normal d3d.

Roland



Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch

2010-02-03 Thread Roland Scheidegger
On 03.02.2010 17:45, michal wrote:
 Roland Scheidegger wrote on 2010-02-03 16:47:
 On 03.02.2010 16:07, michal wrote:
   
 Keith,

 This feature branch adds cylindrical wrap texcoord mode to gallium 
 shader tokens and removes prefilter field from sampler state. 
 Implemented cylindrical wrapping for linear interpolator in softpipe. 
 Not sure whether it makes sense to do it for perspective interpolator. 
 Documented TGSI declaration token.

 Sample fragment shader declaration that wraps S and T coordinates follows.

 DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY

 Please review so I can merge it to master.
 
 Michal,

 why do you need this for linear interpolator and not perspective? I
 think d3d mobile let you disable perspective correct texturing, but it
 is always enabled for normal d3d.
   
 I could not think of a use case that uses perspective and cylindrical 
 interpolation at the same time. If you think it's valid, we can 
 implement cylindrical wrapping for perspective interpolator, but then I 
 am not sure how exactly it should be done, i.e. should we divide and 
 then wrap or the opposite?

Good question. Unfortunately the description of what the wrap renderstate
does doesn't say anything about that. I just assumed that since perspective
correction is normally always enabled, it would be enabled even when
wrapping is used on some coordinates. Not sure what the order of wrap/divide
should be... Also, d3d lets you wrap the 4th coordinate; does that really
make sense?

Roland




Re: [Mesa3d-dev] Grab bag of random questions (whoo)

2010-02-01 Thread Roland Scheidegger
On 31.01.2010 18:41, Christoph Bumiller wrote:
 On 31.01.2010 01:37, Roland Scheidegger wrote:
 Marek Olšák wrote:
   
 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in
  the fragment constants reads, "Since Gallium doesn't support
  GL_ARB_shadow_ambient, this is always (0,0,0,0)", right?

 I think the extension could be added to Gallium since the r300 compiler 
 can generate code for it.
 
 It could. But generally, gallium doesn't implement features common 
 hardware can't do (not only because most drivers except software based 
 ones couldn't implement it, but those features also turn out to be 
 rarely used, for obvious reasons). r300 is an exception here since it 
 emulates ARB_shadow anyway. Though I think if you can make a case why 
 this is really necessary it could be done, but that's not my call.

   
 Another
  comment reads, "Gallium doesn't provide us with any information
  regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE",
 above the texture compare modes. I don't really like that section of
 code, but it probably can't get cleaner, right?

 Even though this is a rarely used feature in OpenGL nowadays, it should 
 get fixed if we want to be GL-compliant. That means adding depth texture 
 modes in pipe_sampler_state and setting them in the Mesa state tracker. 
 The R300 compiler can already generate code for these modes as well.
 
  Note R300 is again a bit special here.
 Actually, I realized my earlier answer doesn't make sense. Hardware 
 which actually supports EXT_texture_swizzle (and native ARB_shadow) 
 should be able to implement this easily. Hardware like i965 which 
 doesn't support EXT_texture_swizzle could do it in the shader.
 Maybe it would make sense to add EXT_texture_swizzle capability in 
 gallium (in the sampler state). That would solve this in a bit more 
 generic way than some special bits for depth texture mode.

   
 From my point of view adding a swizzle in the sampler state
 is a bad idea.
 
 On nv50, this would make texture setup dependent on
 sampler state: we have an Image and Sampler configuration
 buffer containing entries that can be bound to texture and
 sampler units.
 The texture swizzle would be supported by setting a different
 format in the image entry, like BGRA instead of RGBA, just
 that it also supports RGRG or whatever you like.
 
 Well, the normalization bit seems to be stored in the TIC
 entries instead of the TSC ones already, I guess that comes
 from the rectangle texture type, but let's ignore that.
 
 I don't see texture swizzle in d3d10 (but then, I don't know d3d10
 very well), and OpenGL doesn't separate textures and samplers
 anyway, so I'd put it in texture state.
 Keeping a bunch of shaders for texture swizzles doesn't sound
 nice either.
 
 Of course, if other hardware would prefer this in sampler state,
 then ... ah, I should probably let go of the illusion that gallium state
 will continue to nicely map to my hardware ...

I don't know if other hardware likes it in sampler state, but the
problem is it really is sampler state. This is not a property of the
texture, it can change anytime and you don't want to recreate the
texture just because this changes, I think.
I'm not sure how to implement this nicely either, but I'd guess we'd at
least want the swizzle fields to correspond to hardware channels (so for a
luminance_alpha texture, the swizzle would indicate rrrg for the first and
second channels respectively), not the GL-after-sampling mapping the
extension uses.
Hence depth textures used as luminance would be rrr1, as alpha 000r, and
as intensity rrrr.
So, basically, an a8l8 texture would be equivalent to an r8g8 texture when
used for sampling with the same swizzling.
Note that an easy solution (for depth textures) would be to add just new
depth texture formats (one for each alpha, luminance, and intensity
mode), but then again you make this part of the texture, which it is not.
Those are just some quick thoughts however, I don't think anyone would
be opposed if you can come up with a nice solution for this.
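The channel-oriented swizzle sketched above can be made concrete; the enum names here are hypothetical, not u_format's actual identifiers, but the selection logic (pick a source channel or a 0/1 constant per output channel) is the idea being discussed:

```c
#include <assert.h>

/* Hypothetical swizzle selectors: pick source channel x/y/z/w or a
 * constant 0/1 for each output channel. A depth texture sampled as
 * LUMINANCE would use rrr1, as ALPHA 000r, as INTENSITY rrrr. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

static void apply_swizzle(const float in[4], const enum swz swz[4],
                          float out[4])
{
   int i;
   for (i = 0; i < 4; i++) {
      switch (swz[i]) {
      case SWZ_ZERO: out[i] = 0.0f; break;
      case SWZ_ONE:  out[i] = 1.0f; break;
      default:       out[i] = in[swz[i]]; break;  /* hardware channel */
      }
   }
}
```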


Roland



Re: [Mesa3d-dev] Grab bag of random questions (whoo)

2010-02-01 Thread Roland Scheidegger
On 01.02.2010 20:23, Brian Paul wrote:
 Speaking of texture formats and texture sampling, one area of Gallium 
 that's under-specified is what (x,y,z,w) values are returned by TEX 
 instructions when sampling from each of the various texture formats.
 
 A while back I started a table comparing OpenGL to D3D:
 
 
 texture componentsOpenGL  D3D
 ---
 R,G,B,A   (R,G,B,A)  (R,G,B,A)
 R,G,B (R,G,B,1)  (R,G,B,1)
 R,G   (R,G,0,1)  (R,G,1,1)
 R (R,0,0,1)  (R,1,1,1)
 A (0,0,0,A)  (0,0,0,A)
 L (L,L,L,1)  (L,?,?,1) (probably L,L,L,1)
 I (I,I,I,I)  (?,?,?,?)
 UV(0,0,0,1)* (U,V,1,1)
 Z (Z,Z,Z,Z) or   (0,Z,0,1)
(Z,Z,Z,1) or
(0,0,0,Z)**
 other formats?......
You're missing AL, which should be (L,L,L,A) for both OGL and D3D. And yes,
(L,L,L,1) is correct for D3D (that's what i965 does, at least). There are no
intensity textures in d3d (unless you can somehow support that via cap bits,
but in that case I'd certainly expect it to be (I,I,I,I)).
UV is of course really odd in OGL, since the spec says the sample result is
constant, yet you still use the UV components for the bump target. That's
just an oddity to make it fit somehow into the fixed-function pipeline
rather than anything to do with hardware.

Note that the D3D column is only valid for DX9 (and older). DX10 uses
the same mappings as OpenGL (if it supports the format, all luminance,
alpha etc. textures are gone, as are the swizzled bgra formats).

 
 * per http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
 ** depends on GL_DEPTH_TEXTURE_MODE state
 
 For OpenGL, see page 141 of the OpenGL 3.1 spec.
 For D3D, see http://msdn.microsoft.com/en-us/library/ee422472(VS.85).aspx
 
 
 We should first add a column to the above table for Gallium and then 
 decide whether to implement swizzling (and GL_DEPTH_TEXTURE_MODE) with 
 extra GPU instructions or new texture/sampler swizzle state.
But most gpus can do arbitrary swizzling natively, hence inserting gpu
instructions really must be optional. Even hardware which can't do arbitrary
swizzling can sometimes do both the OGL and the D3D mapping, hence we don't
really want additional instructions there either (i965 being the example,
though it's not easy to switch behavior, since that affects not only the
format of the border color but also how the border color is used if the
particular channel isn't present in the texture).
I think we'd want DX10/OGL behavior, and u_format defines it that way.
Except for depth/stencil formats, where the depth always ends up in the
red channel and stencil in green (with the rest undefined).
i965 actually has different depth/stencil formats (a24x8, l24x8, i24x8)
just for those depth texture modes (though the code suggests it won't do
anything if shadow comparison is enabled).
Or maybe we'd want additional formats just for DX9 - sounds like
overkill though. The different border color interpretation of i965
suggests to me that won't do much on its own for conformance anyway.
I think the swizzle values used by u_format are nice. Using xyzw rather
than rgba to refer to the first, etc. channel avoids confusion. Hence
I'd propose we'd use the same for the hypothetical sampler swizzle state
(that is x,y,z,w,0,1, not sure if the _ undefined makes sense there).
The swizzling would be the same as that indicated in u_format for all
textures initially, except depth/stencil.

Roland



Re: [Mesa3d-dev] Grab bag of random questions (whoo)

2010-01-30 Thread Roland Scheidegger
On 30.01.2010 13:06, Corbin Simpson wrote:
 Handful of random things bugging me.

 2) progs/tests/drawbuffers and progs/tests/drawbuffers2, and possibly
 others, segfault with both softpipe and the HW driver at
 sl_pp_version.c:45. I think there's some codegen going on there? At
 any rate, if anybody has any hints on how to solve it, that'd be nice.
Works for me (with softpipe).

 
 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in
 the fragment constants reads, "Since Gallium doesn't support
 GL_ARB_shadow_ambient, this is always (0,0,0,0)", right?
I'd think so. This extension isn't in core GL (and d3d can't do it either),
and AFAIK there's no hardware (which doesn't emulate the shadow
functionality in the fragment shader) that could actually do it.

 Another
 comment reads, "Gallium doesn't provide us with any information
 regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE",
 above the texture compare modes. I don't really like that section of
 code, but it probably can't get cleaner, right?
Yes, that's not very clean, but there doesn't seem to be an easy
solution for this. Exposing this in gallium seems only marginally
useful, since again modern hardware can't really do anything useful with
that information either. Maybe the shader would need to be tweaked if the
wrong channels are actually used (it probably shouldn't be the driver's
responsibility to do this), but I guess assuming the default LUMINANCE
usually works just fine. New OGL won't have that problem...

 
 7) Is there more information on the dual-source blend modes? I'm not
 sure if I can do them; might have to bug AMD for the register values.
Pretty sure most pre-DX10 hardware can't do that; the blend unit just
doesn't have access to multiple source colors.
I've attached a small patch which shows how softpipe implements it.
(But I probably still need to write a testcase for the python state
tracker to see if it actually works...)
It's pretty easy really, just using pixel shader color outputs 0 and 1 for
blending (note that in this form this restricts dual-source blending to
one render target; this is the same restriction DX10 enforces, and if you
look at the i965 docs it actually has this restriction in hardware).

 
 I think that's it for now. Sorry for all the questions, but I'm really
 starting to get a good handle on the hardware and interface, and I'm
 ready to start beating the classic driver in serious benchmarks; I
 think that r300's probably the most mature driver alongside nv50 and
 maybe nv40.
Great!

Roland
diff --git a/src/gallium/drivers/softpipe/sp_quad_blend.c b/src/gallium/drivers/softpipe/sp_quad_blend.c
index d65307b..85fda0b 100644
--- a/src/gallium/drivers/softpipe/sp_quad_blend.c
+++ b/src/gallium/drivers/softpipe/sp_quad_blend.c
@@ -222,7 +222,7 @@ logicop_quad(struct quad_stage *qs,
 
 static void
 blend_quad(struct quad_stage *qs, 
-   float (*quadColor)[4],
+   float (*quadColors)[4][4],
float (*dest)[4],
unsigned cbuf)
 {
@@ -230,6 +230,7 @@ blend_quad(struct quad_stage *qs,
static const float one[4] = { 1, 1, 1, 1 };
    struct softpipe_context *softpipe = qs->softpipe;
float source[4][QUAD_SIZE] = { { 0 } };
+   float (*quadColor)[4] = quadColors[cbuf];
 
/*
 * Compute src/first term RGB
@@ -298,11 +299,23 @@ blend_quad(struct quad_stage *qs,
}
break;
case PIPE_BLENDFACTOR_SRC1_COLOR:
-  assert(0); /* to do */
-  break;
+   {
+  float (*quadColor1)[4] = quadColors[1];
+  assert(cbuf == 0);
+  VEC4_MUL(source[0], quadColor[0], quadColor1[0]); /* R */
+  VEC4_MUL(source[1], quadColor[1], quadColor1[1]); /* G */
+  VEC4_MUL(source[2], quadColor[2], quadColor1[2]); /* B */
+   }
+   break;
case PIPE_BLENDFACTOR_SRC1_ALPHA:
-  assert(0); /* to do */
-  break;
+   {
+  const float *alpha = quadColors[1][3];
+  assert(cbuf == 0);
+  VEC4_MUL(source[0], quadColor[0], alpha); /* R */
+  VEC4_MUL(source[1], quadColor[1], alpha); /* G */
+  VEC4_MUL(source[2], quadColor[2], alpha); /* B */
+   }
+   break;
case PIPE_BLENDFACTOR_ZERO:
   VEC4_COPY(source[0], zero); /* R */
   VEC4_COPY(source[1], zero); /* G */
@@ -372,11 +385,29 @@ blend_quad(struct quad_stage *qs,
}
break;
case PIPE_BLENDFACTOR_INV_SRC1_COLOR:
-  assert(0); /* to do */
-  break;
+   {
+  float (*quadColor1)[4] = quadColors[1];
+  float inv_comp[4];
+  assert(cbuf == 0);
+  VEC4_SUB(inv_comp, one, quadColor1[0]); /* R */
+  VEC4_MUL(source[0], quadColor[0], inv_comp); /* R */
+  VEC4_SUB(inv_comp, one, quadColor1[1]); /* G */
+  VEC4_MUL(source[1], quadColor[1], inv_comp); /* G */
+  VEC4_SUB(inv_comp, one, quadColor1[2]); /* B */
+  VEC4_MUL(source[2], quadColor[2], inv_comp); /* B */
+   }
+   break;
case PIPE_BLENDFACTOR_INV_SRC1_ALPHA:
-  assert(0); /* to do */
-  break;
+   {
+  const float *alpha = quadColors[1][3];
+  float 

Re: [Mesa3d-dev] Grab bag of random questions (whoo)

2010-01-30 Thread Roland Scheidegger
Marek Olšák wrote:
 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in
 the fragment constants reads, "Since Gallium doesn't support
 GL_ARB_shadow_ambient, this is always (0,0,0,0)", right?
 
 I think the extension could be added to Gallium since the r300 compiler 
 can generate code for it.
It could. But generally, gallium doesn't implement features common 
hardware can't do (not only because most drivers except software based 
ones couldn't implement it, but those features also turn out to be 
rarely used, for obvious reasons). r300 is an exception here since it 
emulates ARB_shadow anyway. Though I think if you can make a case why 
this is really necessary it could be done, but that's not my call.

 
 Another
 comment reads, "Gallium doesn't provide us with any information
 regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE",
 above the texture compare modes. I don't really like that section of
 code, but it probably can't get cleaner, right?
 
 Even though this is a rarely used feature in OpenGL nowadays, it should 
 get fixed if we want to be GL-compliant. That means adding depth texture 
 modes in pipe_sampler_state and setting them in the Mesa state tracker. 
 The R300 compiler can already generate code for these modes as well.
Note R300 is again a bit special here.
Actually, I realized my earlier answer doesn't make sense. Hardware 
which actually supports EXT_texture_swizzle (and native ARB_shadow) 
should be able to implement this easily. Hardware like i965 which 
doesn't support EXT_texture_swizzle could do it in the shader.
Maybe it would make sense to add EXT_texture_swizzle capability in 
gallium (in the sampler state). That would solve this in a bit more 
generic way than some special bits for depth texture mode.

 
 7) Is there more information on the dual-source blend modes? I'm not
 sure if I can do them; might have to bug AMD for the register values.
 
 I bet R300 can't do these modes. It's only a Direct3D 10.0 feature, not 
 present in Direct3D 10.1. MS must have a good reason to remove it.
Where did you see that it's removed in 10.1?
Here's a list of blend ops in d3d11:
http://msdn.microsoft.com/en-us/library/ee416042(VS.85).aspx
Note this feature can be present (via cap bits in some limited form) in 
D3D9Ex too, and I thought windows actually used it for (antialiased) 
text rendering (but don't quote me on that).

 
 BTW I looked at some of your patches and r3xx-r5xx cards don't even 
 support separate blend enables, therefore the cap should be 0. Or are 
 you going to emulate this using independent color channel masks and two 
 rendering passes? That could be done in the state tracker. Also, I think 
 the indep. color masks are r5xx-only.
I also think even r500 shouldn't say this is supported. Just changing 
the colormasks isn't going to be very correct...

Roland




Re: [Mesa3d-dev] Grab bag of random questions (whoo)

2010-01-30 Thread Roland Scheidegger
Corbin Simpson wrote:
 
 Another
 comment reads, "Gallium doesn't provide us with any information
 regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE",
 above the texture compare modes. I don't really like that section of
 code, but it probably can't get cleaner, right?
 Yes, that's not very clean, but there doesn't seem to be an easy
 solution for this. Exposing this in gallium only seems marginally
 useful, since again modern hardware can't really do anything useful with
 that information neither. Maybe would need to tweak the shader if
 actually the wrong channels are used (probably shouldn't be the
 driver's responsibility to do this), but I guess assuming default
 LUMINANCE works just fine usually. New OGL won't have that problem...
 
 New OGL? GL3? Sweet.
Well, all the luminance/intensity stuff goes away, so problem solved 
:-). Even when mesa gets to GL3 eventually, we'd still need to deal 
with that for ARB_compatibility, at least in theory.
See also my answer in the other email; I was quite wrong that hardware 
typically can't do much with it.

Roland



Re: [Mesa3d-dev] [PATCH] hack around commas in macro argument

2010-01-26 Thread Roland Scheidegger
On 26.01.2010 09:18, Marvin wrote:
 Jose, Brian,
 
 Marc,

 Why is this necessary? It has been working fine so far. Which gcc version
  are you using? What commas are you referring to?
 
 the PIPE_ALIGN_TYPE macro is so far only used in the cell driver in 
 src/gallium/drivers/cell/spu/spu_main.c  (this is probably why no one noticed 
 it).
 
 The macro takes a type, a struct in this case, which can include commas:
 
 PIPE_ALIGN_TYPE(16,
 struct spu_framebuffer
 {
void *color_start;  /** addr of color surface in main memory 
 */
void *depth_start;  /** addr of depth surface in main memory 
 */
enum pipe_format color_format;
enum pipe_format depth_format;
uint width, height; /** size in pixels */
 ^^^
 
uint width_tiles, height_tiles; /** width and height in tiles */
   ^^^
 
uint color_clear_value;
uint depth_clear_value;
 
uint zsize; /** 0, 2 or 4 bytes per Z */
float zscale;   /** 65535.0, 2^24-1 or 2^32-1 */
 });
 
 This causes a problem, as the preprocessor treats each comma as an argument 
 separator and thus the number of arguments becomes larger than 2.

Hmm, maybe we could just avoid the problem by not using commas in the
struct declaration?
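Another option (an assumption on my part, not what Mesa actually did) would be to make the macro variadic, so commas inside the type no longer split arguments:

```c
#include <assert.h>

/* Hypothetical variadic variant: everything after the alignment is
 * swallowed by __VA_ARGS__, so commas inside the type are harmless.
 * Relies on the GCC-style aligned attribute, which GCC accepts just
 * past the closing brace of a struct definition. */
#define PIPE_ALIGN_TYPE(alignment, ...) \
   __VA_ARGS__ __attribute__((aligned(alignment)))

PIPE_ALIGN_TYPE(16,
struct spu_framebuffer
{
   unsigned width, height;   /* the comma no longer breaks the macro */
});
```

With the aligned(16) attribute applied, the struct's size is rounded up to a multiple of 16, which is an easy way to check the attribute wasn't silently ignored.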

Roland




[Mesa3d-dev] perrtblend merge

2010-01-26 Thread Roland Scheidegger
Hi,

I'm planning on merging this branch to master soon.
This will make it possible to do per render target blend enables,
colormasks, and also per rendertarget blend funcs (with a different CAP
bit for the latter, and this one isn't actually used in mesa state
tracker yet).
None of the drivers other than softpipe implement any of it, but they
were adapted to the interface changes, so they should continue to run.
Apparently, that functionality is only interesting for drivers
supporting multiple render targets, and the hw probably needs to be
quite new (I know that i965 could support it (well, not the multiple
blend funcs, but the rest), but the driver currently only supports 1
render target).
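A rough sketch of what per-render-target blend state looks like; the field names are approximations of the gallium structs, not the exact interface:

```c
#include <assert.h>

#define MAX_RENDER_TARGETS 8

/* Approximation of the post-merge idea: per-RT blend controls live in an
 * array; when independent_blend_enable is unset, rt[0] applies to every
 * render target. Field names are illustrative. */
struct rt_blend_state {
   unsigned blend_enable : 1;
   unsigned colormask : 4;
};

struct blend_state {
   unsigned independent_blend_enable : 1;
   struct rt_blend_state rt[MAX_RENDER_TARGETS];
};

/* Resolve which per-RT record governs render target i. */
static struct rt_blend_state
blend_for_rt(const struct blend_state *b, unsigned i)
{
   return b->rt[b->independent_blend_enable ? i : 0];
}
```

A driver without per-RT support would simply never look past rt[0], which is why the interface change is source-compatible for drivers that don't expose the new caps.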

Roland



Re: [Mesa3d-dev] perrtblend merge

2010-01-26 Thread Roland Scheidegger
Oh, I should have added the PIPE_CAP bits (even if not supported) to all
drivers. Good catch. I'll do that for the other drivers now.

Roland

(btw, I think r500 could do separate colormasks but not separate blend
enables, and there might be more hardware like that. However, this is
not exposed by GL; it might be supported by some DX9 cap bit, but it
didn't seem worthwhile to add separate gallium cap bits for per-rt blend
enables and colormasks, respectively.)

On 26.01.2010 16:37, Corbin Simpson wrote:
 Yeah, r300 doesn't but r600 does. I've read through the branch, and
 the r300g patch looks perfect. I've pushed another patch on top for
 the pipe caps, to avoid post-merge cleanups for myself.
 
 On Tue, Jan 26, 2010 at 7:00 AM, Alex Deucher alexdeuc...@gmail.com wrote:
 On Tue, Jan 26, 2010 at 9:44 AM, Roland Scheidegger srol...@vmware.com 
 wrote:
 Hi,

 I'm planning on merging this branch to master soon.
 This will make it possible to do per render target blend enables,
 colormasks, and also per rendertarget blend funcs (with a different CAP
 bit for the latter, and this one isn't actually used in mesa state
 tracker yet).
 None of the drivers other than softpipe implement any of it, but they
 were adapted to the interface changes so should continue to run.
 Apparently, that functionality is only interesting for drivers
 supporting multiple render targets, and the hw probably needs to be
 quite new (I know that i965 could support it (well not the multiple
 blend funcs but the rest), but the driver currently only supports 1
 render target).
 FWIW, AMD R6xx+ hw supports MRTs and per-MRT blends as well, although
 at the moment the driver also only supports 1 RT.

 Alex


 
 
 




Re: [Mesa3d-dev] What about gl_rasterization_rules?

2010-01-21 Thread Roland Scheidegger
On 21.01.2010 18:47, Luca Barbieri wrote:
 On Thu, Jan 21, 2010 at 6:34 PM, Corbin Simpson
 mostawesomed...@gmail.com wrote:
 Maybe it's just me, since I actually wrote the docs, but does anybody
 else read them?

 From cso/rasterizer.html (viewable at e.g.
 http://people.freedesktop.org/~csimpson/gallium-docs/cso/rasterizer.html
 ):

 gl_rasterization_rules
Whether the rasterizer should use (0.5, 0.5) pixel centers. When
 not set, the rasterizer will use (0, 0) for pixel centers.

 So why aren't these patches using 
 pipe_rasterizer_state::gl_rasterization_rules?
 
 It's a different thing.
 gl_rasterization_rules affects the way fragments are rasterized, i.e.
 the set of fragments which a primitive is mapped to.
 Changing it is equivalent to adding/subtracting a subpixel offset to
 the viewport (which seemingly depends on the primitive type).
 
 The pixel center convention instead sets how the values look like in
 the fragment shader.
 Changing it is equivalent to adding/subtracting 0.5 to the
 fragment.position in the fragment shader.
 
 In other words, yes, if you set gl_rasterization_rules and the pixel
 center in a mismatched way, fragment.position will not be the
 coordinate of the rasterization center.
 
 As another example, suppose you do a blit with the 3D engine using
 fragment.position to sample from a texture rectangle with bilinear
 filtering.
 A wrong rasterization convention may cause 1 pixel black bars at the borders.
 A wrong pixel center convention will cause a 2x2 blur filter to be
 applied to the texture.
 
 BTW, gl_rasterization_rules is ignored by almost all drivers

Most but not all. Not the software based ones, for instance. Should be
easy to add to r300 (and the nouveau ones, I assume), I guess these
simply don't care enough about environments with different (i.e. DX9)
rasterization rules :-).

Roland

 
From the spec:
 
 The scope of this extension deals *only* with how the fragment
 coordinate XY location appears during programming fragment processing.
 Beyond the scope of this extension are coordinate conventions used
 for rasterization or transformation.
 




Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge

2010-01-21 Thread Roland Scheidegger
On 21.01.2010 20:20, michal wrote:
 Hi,
 
 This simple feature branch adds support for two-dimensional constant 
 buffers in TGSI.
 
 An example shader would look like this:
 
 FRAG
 
 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR
 DCL CONST[1][1..2]
 
 MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
 
 END
 
 For this to work, one needs to bind a buffer to slot nr 1 containing at 
 least 3 vectors.
 

Looks good to me - I wondered how you'd use the multiple constant
buffers possible by the gallium interface, and that is how :-).
Is that something we'd need a cap bit for in the future? Would this be
also used by ARB_uniform_buffer_object / GL 3.1?

Roland



Re: [Mesa3d-dev] [PATCH 2/2] st: don't assert on empty fragment program

2010-01-18 Thread Roland Scheidegger
On 18.01.2010 19:15, Luca Barbieri wrote:
 Breakpoint 3, _mesa_ProgramStringARB (target=34820, format=34933,
 len=70, string=0x85922ba) at shader/arbprogram.c:434
 434  GET_CURRENT_CONTEXT(ctx);
 $31 = 0x85922ba !!ARBfp1.0\n\nOPTION
 ARB_precision_hint_fastest;\n\n\n\nEND\n
 
 Not sure why Sauerbraten does this, but it does, at least on my system
 (Ubuntu Karmic, nv40 driver) and it should be legal.

Probably depth writes only enabled for things like shadows?

Roland



Re: [Mesa3d-dev] Gallium feature levels

2010-01-12 Thread Roland Scheidegger
On 11.01.2010 22:03, Zack Rusin wrote:
 On Monday 11 January 2010 15:17:00 Roland Scheidegger wrote:
 - extra mirror wrap modes - i don't think mirror repeat was ever
 supported and mirror clamp was removed in d3d10 but it seems that some
 hardware kept support for those
 Mirror repeat is a core feature in GL since 1.4 hence we can't just drop
 it. 
 
 I wasn't suggesting that. I was just pointing out what happens with it from 
 the D3D side.
 
 I think all hardware we'd ever care about would support it. mirror
 clamp / mirror clamp to edge are only an extension, though
 (ATI_texture_mirror_once). (I think the dx mirror once definition is
 probably mirror_clamp_to_edge in opengl parlance).
 
 That's possible. As mentioned I'm not really sure what to do with this 
 feature.
 
 - shadow maps - it's more of an researched guess since it's largely
 based on a format support, but as far as i can tell all d3d10 hardware
 supports it, earlier it varies (e.g. nvidia did it for ages)
 Required for GL 1.4. I thought it was pretty much required for d3d
 sm2.0, though you're right you could probably just not support the
 texture format there. Anyway, most hardware should support it, I believe
 even those which didn't really supported it at DX9 SM 2.0 time supported
 it (chips like radeon r300 lacked the hw to do the comparison in the
 texture unit, but it can be more or less easily implemented in the pixel
 shader, though the implementation will suck as it certainly won't do PCF
 just use some point sampling version - unless you're willing to do a
 much more complex implementation in the pixel shader, but then on this
 generation of hardware you might exceed maximum shader length). I
 believe all hardware supporting SM 2.0 could at least do some sampling
 of depth textures, though possibly only 16 bit and I'm not sure
 filtering worked in all cases.
 
 Yes, but the issue is that I'm not sure how to represent it from a feature 
 level case. Are you saying we should just enable it for all feature levels? 
 That'd be nice.
Hmm, maybe.

 
 I think the other stuff is acceptable. Take a look at the docs and let me
 know what you think.
 What is feature level 1 useful for? I thought we'd really wanted DX9
 level functionality as a bare minimum. GL2.x certainly needs cards
 supporting shader model 2 (and that is a cheat, in reality it would be
 shader model 3).
 
 The main issue was having something without hardware vs in the feature 
 levels. 
 It was supposed to be whatever the current i915 driver currently supports, 
 but 
 yea, I think it doesn't make sense and level 2 should be minimum.
 
 Also, I don't quite get the shader model levels. I thought there were
 mainly two different DX9 versions, one with sm 2.0 the other with 3.0,
 with noone caring about other differences (as most stuff was cappable
 anyway). However, you've got 3 and all of them have 2.0 shader model?
 
 As mentioned this is based on the D3D feature level concept. It's the first 
 link I put in the the references:
 http://msdn.microsoft.com/en-us/library/ee422086(VS.85).aspx#Overview
 It's there because that's what Microsoft defined as feature level and I'm 
 assuming it's because they had a good need for it :)
Ah, that's why it doesn't make much sense :-).
I'm not sure what requirements got them to these levels. I definitely
think those 3 dx9 levels are very odd and don't even make sense for d3d
only, much less for gallium. For example, requires at least max aniso
16? You got to be kidding, aniso spec is so fuzzy you can pass any cheap
point filter as compliant (well almost) anyway, so it doesn't make any
sense (plus, this only really enhances filtering quality, it makes
absolutely zero difference for writing applications).
I think the retrofit of 9_1, 9_2, 9_3 to some arbitrary DX9 versions
doesn't match hardware really neither. The most distinguishable feature
of DX9.0c (which was the last version IIRC) was definitely SM 3.0, but
of course like everything else (multiple render targets, etc.) it was
optional. I think for gallium it would make way more sense to expose
only 2 feature levels - basically drop 9_1, and additionally bump 9_3 to
include SM 3.0 (I wonder if that's not just a typo there, after all the
model is called "ps_4_0_level_9_3" unlike the others, which are called
"9_1" only).
Though granted nv20/25 can't do separate alpha blend (but it can't do
fragment shaders either, really, so I don't know how well that driver is
ever going to work), i915 may not be able to do occlusion queries (not
sure if hw can't do it but the current driver doesn't implement it),
everybody (I think) can do mirror_once, and I don't know what
overlapping vertex elements are.

 
 More comments below.

 +static const enum pipe_feature_level
 +i915_feature_level(struct pipe_screen *screen)
 +{
 +   return PIPE_FEATURE_LEVEL_1;
 +}
 What's the reason this is not feature level 2?
 
 Yea, I was winging it for all the drivers because I couldn't be bothered to 
 do 
 a cross

[Mesa3d-dev] gallium-noconstbuf merge

2010-01-11 Thread Roland Scheidegger
Hi,

I'll plan to merge gallium-noconstbuf today. It's a pretty simple API
change, so things should continue to run :-).

Roland



Re: [Mesa3d-dev] gallium-noconstbuf merge

2010-01-11 Thread Roland Scheidegger
On 11.01.2010 16:42, Keith Whitwell wrote:
 On Mon, 2010-01-11 at 07:33 -0800, Roland Scheidegger wrote:
 Hi,

 I'll plan to merge gallium-noconstbuf today. It's a pretty simple API
 change, so things should continue to run :-).
 
 Roland,
 
 Before you do this, can you make sure that the set_constant_buffer()
 entrypoint is properly documented in gallium/docs?

I was planning to do that after the merge, since the branch is too old
to include docs, so I'd have to merge from master just for that if I'd
do it before the merge.

Roland



Re: [Mesa3d-dev] gallium-noconstbuf merge

2010-01-11 Thread Roland Scheidegger
On 11.01.2010 16:53, Keith Whitwell wrote:
 On Mon, 2010-01-11 at 07:50 -0800, Roland Scheidegger wrote:
 On 11.01.2010 16:42, Keith Whitwell wrote:
 On Mon, 2010-01-11 at 07:33 -0800, Roland Scheidegger wrote:
 Hi,

 I'll plan to merge gallium-noconstbuf today. It's a pretty simple API
 change, so things should continue to run :-).
 Roland,

 Before you do this, can you make sure that the set_constant_buffer()
 entrypoint is properly documented in gallium/docs?
 I was planning to do that after the merge, since the branch is too old
 to include docs, so I'd have to merge from master just for that if I'd
 do it before the merge.
 
 OK -- that's a decent excuse...  Can you post a first draft of the docs
 here before merging?

Ok here's a first stab. Actually I'm not sure what the documentation should
look like; there are no other functions really commented yet. Should
these include function parameters / return values?
Also I'll need to work on the syntax a bit I know...

Roland
diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 21f5f91..a6fe408 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -34,6 +34,12 @@ buffers, surfaces) are bound to the driver.
 
 
 * ``set_constant_buffer``
+void (*set_constant_buffer)( struct pipe_context *,
+ uint shader, uint index,
+ struct pipe_constant_buffer *buf );
+sets a constant buffer to be used in a given shader type. index is
+used to indicate which buffer to set (note that some apis allow multiple ones
+to be set, though drivers are mostly restricted to the first one right now).
 * ``set_framebuffer_state``
 * ``set_fragment_sampler_textures``
 * ``set_vertex_sampler_textures``


Re: [Mesa3d-dev] Gallium feature levels

2010-01-11 Thread Roland Scheidegger
On 11.01.2010 18:49, Zack Rusin wrote:
 Hey,
 
 knowing that we're starting to have serious issues with figuring out what 
 features the given device supports and what api's/extensions can be 
 reasonably 
 implemented on top of it I've spent the weekend trying to define feature 
 levels. Feature levels were effectively defined by the Direct3D version 
 numbers. 
 Attached is a patch and documentation for the feature levels. I'm also 
 attaching gallium_feature_levels.rst file which documents what each feature 
 level means and what apis can be reasonably supported by each (I figured it's 
 going to be easier to look at it outside the diff).
 
 There's a few features that are a bit problematic, in no particular order:
 - unnormalized coordinates, we don't even have a cap for those right now but 
 since that feature doesn't exist in direct3d (all coords are always 
 normalized 
 in d3d) the support for it is hard to define in term of a feature level
 - two-sided stencil - d3d supports it in d3d10 but tons of hardware supported 
 it earlier
 - extra mirror wrap modes - i don't think mirror repeat was ever supported 
 and 
 mirror clamp was removed in d3d10 but it seems that some hardware kept 
 support 
 for those
Mirror repeat is a core feature in GL since 1.4 hence we can't just drop
it. I think all hardware we'd ever care about would support it. mirror
clamp / mirror clamp to edge are only an extension, though
(ATI_texture_mirror_once). (I think the dx mirror once definition is
probably mirror_clamp_to_edge in opengl parlance).


 - shadow maps - it's more of an researched guess since it's largely based 
 on 
 a format support, but as far as i can tell all d3d10 hardware supports it, 
 earlier it varies (e.g. nvidia did it for ages)
Required for GL 1.4. I thought it was pretty much required for d3d
sm2.0, though you're right you could probably just not support the
texture format there. Anyway, most hardware should support it, I believe
even those which didn't really supported it at DX9 SM 2.0 time supported
it (chips like radeon r300 lacked the hw to do the comparison in the
texture unit, but it can be more or less easily implemented in the pixel
shader, though the implementation will suck as it certainly won't do PCF
just use some point sampling version - unless you're willing to do a
much more complex implementation in the pixel shader, but then on this
generation of hardware you might exceed maximum shader length). I
believe all hardware supporting SM 2.0 could at least do some sampling
of depth textures, though possibly only 16 bit and I'm not sure
filtering worked in all cases.


 
 I think the other stuff is acceptable. Take a look at the docs and let me 
 know 
 what you think.

What is feature level 1 useful for? I thought we'd really wanted DX9
level functionality as a bare minimum. GL2.x certainly needs cards
supporting shader model 2 (and that is a cheat, in reality it would be
shader model 3).

Also, I don't quite get the shader model levels. I thought there were
mainly two different DX9 versions, one with sm 2.0 the other with 3.0,
with noone caring about other differences (as most stuff was cappable
anyway). However, you've got 3 and all of them have 2.0 shader model?

More comments below.


 +static const enum pipe_feature_level
 +i915_feature_level(struct pipe_screen *screen)
 +{
 +   return PIPE_FEATURE_LEVEL_1;
 +}
What's the reason this is not feature level 2?


 +static const enum pipe_feature_level
 +nv30_screen_feature_level(struct pipe_screen *screen)
 +{
 +   return PIPE_FEATURE_LEVEL_1;
 +}
 +
Hmm in theory this should be feature level 2. Maybe the driver doesn't
quite cut it though...

 +static const enum pipe_feature_level r300_feature_level(
 +   struct pipe_screen* pscreen)
 +{
 +   if (r300screen->caps->is_r500) {
 +  return PIPE_FEATURE_LEVEL_2;
 +   } else {
 +  return PIPE_FEATURE_LEVEL_1;
 +   }
 +}
Shouldn't one be feature level 3 (or maybe 4?) the other 2?



 
 
  Profile          7 (2009)  6 (2008)  5 (2006)    4 (2004)       3 (2003)       2 (2002)       1 (2000)
  
  API Support      DX11      DX10.1    DX10/GL3.2  DX9.2          DX9.1          DX9.0          DX7.0
                   GL4.0     GL3.2+    GL3.2       GL3.0          GL2.x          GL2.x          GL2.x
                   VG        VG        VG          VG             VG             VG             VG
                   CL1.0     CL1.0     CL1.0
  
  Shader Model     5.0       4.x       4.0         2.0            2.0            2.0            1.0
                                                   4_0_level_9_3  4_0_level_9_1  4_0_level_9_1
  
  Fragment Shader

Re: [Mesa3d-dev] RFC: gallium changes for conditional rendering

2010-01-04 Thread Roland Scheidegger
On 04.01.2010 15:48, Brian Paul wrote:
 Keith Whitwell wrote:
 On Thu, 2009-12-31 at 15:57 -0800, Brian Paul wrote:
 The BY_REGION modes indicate that it's OK for the GPU to discard the
 fragments in the region(s) which failed the occlusion test (perhaps
 skipping other per-fragment ops that would have otherwise occurred).
 See the spec at
 http://www.opengl.org/registry/specs/NV/conditional_render.txt for
 details.

 I'd be happy to omit those modes for now.  But since they're in the NV
 spec, I suspect NVIDIA hardware (at least) can make use of them.

 Brian,

 Lets leave them in - I'm presuming the no-op implementation which maps
 them down to the regular tokens is fine. 
 
 Yes.
 
 Incidentally, it would be fairly easy to take advantage of the 
 BY_REGION modes in the llvm driver.  If the number of samples passed 
 in a tile during occlusion testing is zero, the tile can be skipped 
 entirely when doing the conditional render.
 
 I'll check in these changes later today.

I think the main benefit for the by-region modes might have been saving
the vertex processing for the second GPU, but it's nice that these modes
seem useful for other cases as well.
(Remember for split-frame SLI, there will be two hardware occlusion
query results, one for each gpu, and by-region modes will make it
possible to run the rendering commands only on one when using
conditional render).

Roland



Re: [Mesa3d-dev] [PATCH] [RFC] Remove PIPE_TEX_FILTER_ANISO to properly implement GL_EXT_texture_filter_anisotropic

2010-01-04 Thread Roland Scheidegger
On 01.01.2010 23:32, Luca Barbieri wrote:
 Currently Gallium defines a specific filtering mode for anisotropic
 filtering.
 
 This however prevents proper implementation of
 GL_EXT_texture_filter_anisotropic.
 
 The spec (written by nVidia) contains the following text:  A
 texture's maximum degree of anisotropy is specified independent from
 the texture's minification and magnification filter (as opposed to
 being supported as an entirely new filtering mode). Implementations
 are free to use the specified minification and magnification filter
 to select a particular anisotropic texture filtering scheme.  For
 example, a NEAREST filter with a maximum degree of anisotropy of two
 could be treated as a 2-tap filter that accounts for the direction of
 anisotropy.  Implementations are also permitted to ignore the
 minification or magnification filter and implement the highest
 quality of anisotropic filtering possible.
 
 and
 
  Should there be a particular anisotropic texture filtering
 minification and magnification mode?
 
 RESOLUTION:  NO.  The maximum degree of anisotropy should control 
 when anisotropic texturing is used.  Making this orthogonal to the
 minification and magnification filtering modes allows these settings
 to influence the anisotropic scheme used.  Yes, such an anisotropic
 filtering scheme exists in hardware.
 
 Gallium does the opposite, and this prevents use of nearest
 anisotropic filtering which is supported in nVidia hardware and also
 introduces redundant state.
 
 This patch removes PIPE_TEX_FILTER_ANISO. Anisotropic filtering is
 enabled if and only if max_anisotropy > 1.0. Values between 0.0 and
 1.0, inclusive, of max_anisotropy are to be considered equivalent,
 and meaning to turn off anisotropic filtering.
 
 This approach has the small drawback of eliminating the possibility
 of enabling anisotropic filter on either minification or
 magnification separately, which Radeon hardware seems to support, is
 currently support by Gallium but not exposed to OpenGL. If this is
 actually useful it could be handled by splitting max_anisotropy in
 two values and adding an appropriate OpenGL extension.
 
 How does Radeon anisotropic magnification differ from linear
 magnification?

Note that different 3d apis have different requirements - ideally we
should be able to choose some state which suits all of them.
In particular, d3d10/11 have a separate filter mode for aniso (which
applies to all of min/mag/mip filters at the same time).
d3d9 also has special aniso filter, but it can be set separately for min
and mag - apart from aniso d3d9 also has some more filters like 4-sample
tent/gaussian, all of them with undefined results if used as mip filter.
max aniso values with d3d can be from (uint) 1 to 16, and I haven't seen
hardware yet which could use float values for that.

So it seems for conformant d3d9 (but not d3d10) implementation you'll
need to be able to enable aniso for min/mag separately.

Meanwhile, you said
 This however prevents proper implementation of
 GL_EXT_texture_filter_anisotropic.
This isn't quite true - you've quoted it yourself Implementations are
also permitted to ignore the minification or magnification filter and
implement the highest quality of anisotropic filtering possible.

I don't think it's terribly useful to being able to enable anisotropic
filtering with other min/mag filters, and d3d never allowed it hence
hardware support for this will likely be rare. I don't really have a
strong opinion though if we should allow this in the api or not, I guess
it might make some drivers (except nvidia ones) (plus d3d state
trackers...) a bit more complicated but it shouldn't be too bad, and
maybe there's actually one app out there which would use it - or maybe
it'll give better results for things like forced aniso.

Roland






Re: [Mesa3d-dev] Yet more r300g fear and loathing...

2009-12-21 Thread Roland Scheidegger
On 21.12.2009 15:13, Henri Verbeet wrote:
 2009/12/21 Corbin Simpson mostawesomed...@gmail.com:
 So, yet another thing that r300 sucks balls at: NPOT textures. We've
 been talking it over on IRC, and here's the options.

 1) Don't do NPOT. Stop advertising PIPE_CAP_NPOT, refuse to accept
 non-NPOT dimensions on textures. This sucks because it means that we
 don't get GL 2.0, which means most apps (bless their non-compliant
 souls) will refuse to attempt GLSL, which means that there's really no
 point in continuing this driver.

 2) Don't do NPOT in the pipe, but do it in the state tracker instead,
 as needed. Write up the appropriate fallbacks, and then let ARB_npot
 be advertised by the state tracker regardless of whether PIPE_CAP_NPOT
 is set. Lots of typing, though. Lots and lots of typing.

 3) Same as above, but put all the fallbacks in the pipe instead of the
 state tracker. I am *really* not fond of this, since PIPE_CAP was not
 intended for lies, but it was mentioned in IRC, so I gotta mention it
 here.

 3) The fglrx special: Don't require ARB_npot for advertising GL 2.0. I
 figured this wasn't on the table, but you never know...

 This is not really about where to implement the fallbacks, but as far
 as Wine is concerned, we'd mostly care about not triggering those if
 we can avoid them, e.g. by restrictions on clamping and filtering. We
 don't care much if GL 2.0 is supported or not, so a hypothetical
 MESA_conditional_npot extension would work for us. Other
 applications might care though, in which case an extension that allows
 us to query what situations trigger fallbacks would work for us as
 well.
 
 The fglrx solution mostly just sucks, for an important part because
 there's (afaik) no real documentation on what the restrictions are,
 and the reported extensions are now inconsistent with the reported GL
 version. That said, Wine has code to handle this case now, and I
 imagine other applications do as well.
This is a very common hardware problem, there's lots of hardware out
there which can do some (like r300) or even all glsl shaders but lack
true npot support. I suspect there might be a few apps which try to see
if ARB_texture_npot is supported, and if not, they'll assume that
functionality isn't supported even if the driver says GL 2.0. There's
certainly precedent for not announcing extensions even if you have to
support it for a given gl version, one prominent example would be the
nvidia GF3/4 cards which were GL 1.4 but couldn't do blend_func_separate
- they didn't announce support for EXT_blend_func_separate and just used
software fallback when they needed. So of course just not announcing
support for it isn't sufficient you still need to implement this somehow
(unless you just want to misrender...) but it might give apps a hint,
even though the API wasn't really designed for this. Sounds like it'll
just pollute things though. Last time I checked the extension mechanism
in gallium when used with dri state tracker was broken though and needed
some work anyway (because dri_init_extensions was called after
st_create_context, and the former just enables lots of extensions
regardless any cap bits, hence the extension string will have lots of
extensions which might not be supported).

Anyway, doing this in a utility module sounds good, though I'm not sure
what exactly you want to do. You could certainly fix up all texture
lookups in the shader by doing address calculations manually and so on,
but that gets a bit complicated quite soon I guess (in case of r300 it
probably also increases chances a shader won't fit onto hardware a lot).
Maybe misrendering things would still be an option, I think it would
mostly be clamp modes which wouldn't work correctly, since AFAIK you
could make even mipmaps work (on r300 at least).

Roland



Re: [Mesa3d-dev] Yet more r300g fear and loathing...

2009-12-21 Thread Roland Scheidegger
The draw module approach can only work if the texcoords are used
directly for texture lookups, not for calculated coords (it should be
possible to detect these cases though).

Roland

On 21.12.2009 19:32, Keith Whitwell wrote:
 Faking those wrap modes is something that could be done either in the
 draw module (by decomposing triangles and adjusting the texcoords) or
 in the pixel shader (by adding logic to adjust the texcoord on a
 per-pixel basis).  Probably the draw-module approach is the easiest
 to implement and is appropriate for an infrequently used path - you
 still get hardware rasterization speeds, just a more expensive vertex
 path.
 
 Keith  From: Alex Deucher
 [alexdeuc...@gmail.com] Sent: Monday, December 21, 2009 10:18 AM To:
 tom fogal Cc: Mesa3D-Development Subject: Re: [Mesa3d-dev] Yet more
 r300g fear and loathing...
 
 I work on real-time visualization apps; the one in particular I'm 
 thinking of does texture sampling of potentially-NPOT textures via 
 GLSL.  If sampling a NPOT texture is not going to run in hardware, 
 the app is useless.  Further, our app keeps track of the amount of
 GL memory allocated for textures, FBOs and the like.  If a texture
 is going to be silently extended, that messes with our management
 routines [1].
 
 
 The hardware supports rectangular texture sampling.  What's missing
 is support for certain wrap modes and mipmaps with npot textures. 
 Neither of which are used that often.
 




[Mesa3d-dev] gallium-edgeflags branch

2009-12-18 Thread Roland Scheidegger
Hello,

I plan to merge gallium-edgeflags branch soon.
I should have fixed up drivers syntactically, but note some will break
if applications use edgeflags. In particular the drivers which so far
have chosen to ignore edgeflags completely and haven't implemented a
fallback to the draw module might break (I'm looking at you, r300
and nv30!...).
If those drivers want to continue to have broken edgeflag support and
you just don't want them to crash, you'll need to fix them up so
they map the edgeflag output of the vertex shader to something halfway
meaningful for the hw, like an unused temp. But really the right
solution is to fix them so they use the draw module for things they
can't handle, like svga and nv40 do (or, of course, make them handle
edgeflags properly in hardware, but that might be dx10-class hardware
only which truly can do it).
Drivers for hardware without a hw vertex unit shouldn't have any
problem, since draw will handle everything for them.
You can use progs/trivial/tri-edgeflag for instance to see what happens.

Roland




Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions

2009-12-15 Thread Roland Scheidegger
On 15.12.2009 14:14, michal wrote:
 Guys,
 
 Does the attached patch make sense to you?
 
 I replaced the incomplete switch-cases with calls to u_format_access 
 functions that are complete but are going to be a bit more expensive to 
 call. Since they are not used very often in the mesa state tracker, I 
 thought it's a good compromise.

They are not only used in state trackers, but drivers for instance as
well. That said, it's probably not really a performance critical path.
Though I'm not sure it makes sense to keep these functions even around
if they'll just do a single function call. Also, I'm pretty sure your
usage of the union isn't strict aliasing compliant (as far as I can tell
you could just go back and remove that ugly union again), though it's
probably one of the cases gcc won't complain (and hopefully won't
miscompile).

Roland

--
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions

2009-12-15 Thread Roland Scheidegger
On 15.12.2009 18:02, michal wrote:
 Roland Scheidegger pisze:
 On 15.12.2009 14:14, michal wrote:
   
 Guys,

 Does the attached patch make sense to you?

 I replaced the incomplete switch-cases with calls to u_format_access 
 functions that are complete but are going to be a bit more expensive to 
 call. Since they are not used very often in the mesa state tracker, I 
 thought it's a good compromise.
 
 They are not only used in state trackers, but drivers for instance as
 well. That said, it's probably not really a performance critical path.
 Though I'm not sure it makes sense to keep these functions even around
 if they'll just do a single function call. Also, I'm pretty sure your
 usage of the union isn't strict aliasing compliant (as far as I can tell
 you could just go back and remove that ugly union again), though it's
 probably one of the cases gcc won't complain (and hopefully won't
 miscompile).

   
 I am casting to (void *) and then u_format casts it back to whatever it 
 needs to. I think I am innocent.
Casts to void * and back to something are only safe if the something
is the same as it initially was. Well in theory anyway. That's also
where some of the initial warnings came from, callers using some pointer
to unsigned, which then in the end got cast to ubyte * or whatever. An
intermediate cast to void * doesn't change anything. That said, the
callers probably couldn't have handled the formats not returning the
right type anyway.
Often though gcc won't complain about aliasing if you use some void *
pointer in a function call and cast it to something else than it was, I
think it usually won't be able to figure out what the original type was,
hence it needs to assume it can alias with anything.
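
To make the distinction concrete, here is a small hedged sketch (illustrative code, not from the Mesa tree): dereferencing through a cast pointer is only valid when the pointee's real (effective) type matches, while memcpy reads bytes regardless of the original object's type.

```c
#include <string.h>

/* Only legal when p really points at a float object; the intermediate
 * void * detour changes nothing about the aliasing rules. */
static float
load_float_cast(const void *p)
{
   return *(const float *) p;
}

/* memcpy has no such restriction: it copies bytes whatever the
 * original object's declared type was. */
static float
load_float_memcpy(const void *p)
{
   float f;
   memcpy(&f, p, sizeof f);
   return f;
}
```

Modern compilers expand such a memcpy inline, so there is normally no performance cost for the portable version.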

 
 Anyway, I will go after Keith's suggestion and fill in only the 
 switch-default case. We can always nuke the special cases later when/if 
 we realise the performance impact can be neglected.
Yes, sounds good.

Roland



Re: [Mesa3d-dev] r300 driver help needed

2009-12-14 Thread Roland Scheidegger
On 14.12.2009 10:29, michael wang wrote:
 Dear Mesa developers,
 
 I am learning OpenGL on my notebook (with an old ATI Radeon X600 video
 card), but I cannot get GL_LINE_STIPPLE work. It draws solid line only.
 
 glxinfo shows I'm using the R300 driver, and some study of the source
 code I find it fallback (to software rendering I suppose) when I
 enable GL_LINE_STIPPLE.
 
 So my question is:
 1. How can I check why my software rendering does not do line stipple?
You could try it with software mesa and see if it works there.
Also, the fallback for r300 only happens if you don't have
disable_lowimpact_fallbacks set, so if this is set for whatever reason
you will indeed get a solid line. If you set RADEON_DEBUG=fall it should
print out a warning if it hits that line stipple fallback.

 2. Is the R300 project still active? If so, where should I report this bug?
The project is still alive, if it's a driver bug and not your app you
could file a bug at bugs.freedesktop.org.

Roland



Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge

2009-12-11 Thread Roland Scheidegger
On 09.12.2009 18:58, michal wrote:
 Keith Whitwell pisze:
 On Wed, 2009-12-09 at 09:16 -0800, michal wrote:
   
 Hi all,

 I would like to merge this branch back to master this week. If anyone 
 could test if the build works on his/her system, it would be nice.

 Thanks.
 
 Michal,

 Can you detail what testing you've done on this branch and which
 environments you have/haven't built on?


   
 Testing:
 
 * Capture the output of the old syntax parser and compare it with the 
 output of the new parser. No regressions found. Used a set of over 400 
 shaders to perform the comparison.
 
 * Run GLSL Parser Test to see if the new parser successfully integrates 
 with the rest of Mesa. No regressions found.
 
 So far I have been building that with scons on windows. I am planning to 
 fix the build with make and scons on linux.
Seems to compile just fine now with make.
Too bad all the strict-aliasing violations are still there (in
grammar.c), I'll give this a look (but don't wait for it for merging).
Also, there seems to be some char/byte uncleanliness, I get a gazillion
warnings like:
shader/grammar/grammar.c: In function ‘get_spec’:
shader/grammar/grammar.c:1978: warning: pointer targets in passing
argument 1 of ‘strlen’ differ in signedness
/usr/include/string.h:397: note: expected ‘const char *’ but argument is
of type ‘byte *’
shader/grammar/grammar.c:1978: warning: pointer targets in passing
argument 1 of ‘__builtin_strcmp’ differ in signedness
shader/grammar/grammar.c:1978: note: expected ‘const char *’ but
argument is of type ‘byte *’
shader/grammar/grammar.c:1978: warning: pointer targets in passing
argument 1 of ‘strlen’ differ in signedness

Roland



Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge

2009-12-09 Thread Roland Scheidegger
On 09.12.2009 18:16, michal wrote:
 Hi all,
 
 I would like to merge this branch back to master this week. If anyone 
 could test if the build works on his/her system, it would be nice.
Good stuff!
Looks like only scons build system is working though.

Roland





Re: [Mesa3d-dev] Branch pipe-format-simplify open for review

2009-12-08 Thread Roland Scheidegger
On 08.12.2009 15:55, michal wrote:
 This branch simplifies pipe/p_format.h by making enum pipe_format what 
 it should have been -- an enum.
 
 As a result there is no extra information encoded in it and one needs to 
 use auxiliary/util/u_format.h to get that info instead. Linking to the 
 auxiliary/util lib is necessary.
 
 Please review and if you can test if it doesn't break your setup, I will 
 appreciate it.
 
 I would like to hear from r300 and nouveau guys, as those drivers were 
 using some internal macros and I weren't 100% sure I got the conversion 
 right.

Looks nice, though it is unfortunately based on pre gallium-noblocks
merge, so I suspect you'll get a conflict for almost every patch chunk
at least in drivers if you try to merge it...

Roland



Re: [Mesa3d-dev] Branch pipe-format-simplify open for review

2009-12-08 Thread Roland Scheidegger
On 08.12.2009 16:49, michal wrote:
 Roland Scheidegger pisze:
 On 08.12.2009 15:55, michal wrote:
   
 This branch simplifies pipe/p_format.h by making enum pipe_format what 
 it should have been -- an enum.

 As a result there is no extra information encoded in it and one needs to 
 use auxiliary/util/u_format.h to get that info instead. Linking to the 
 auxiliary/util lib is necessary.

 Please review and if you can test if it doesn't break your setup, I will 
 appreciate it.

 I would like to hear from r300 and nouveau guys, as those drivers were 
 using some internal macros and I weren't 100% sure I got the conversion 
 right.
 
 Looks nice, though it is unfortunately based on pre gallium-noblocks
 merge, so I suspect you'll get a conflict for almost every patch chunk
 at least in drivers if you try to merge it...

   
 I didn't touch pipe blocks -- I left the pf_getblock* and friends in 
 pipe_format.h intact.
Yes, but you're bound to get lots of conflicts because you replaced for
instance pf_format_get_block with util_format_get_block whereas that
stuff is removed from master because pipe_format_block (and the
block/nblocksx/nblocksy variables in pipe_texture and pipe_transfer) are
gone completely.
I quickly tried a merge and there were conflicts in over 40 files - from
a quick glance though they should be trivial to resolve. And I don't
think there's too much hidden stuff which won't work any longer - just
let util_format_get_block die and it should probably work out ok.

 
 How severe is the gallium-noblocks change? I would like to avoid merging 
 master into this branch.
It's not really that severe, it just touched a lot of the same places in
drivers this change does.
btw, I also avoided merging master to feature branch when I merged
gallium-noblocks, and instead fixed up conflicts on merge to master (and
adapted stuff which needed changes later). Is there some policy for this?

Roland



Re: [Mesa3d-dev] gallium-strict-aliasing branch merge

2009-12-08 Thread Roland Scheidegger
Keith,

I think there might be some slight issue with some of the changes in the
drivers I did. In particular, I was under the impression it would be ok
to do something like
union a_union {
  int i;
  double d;
};
int f() {
   double d = 3.0;
   return ((union a_union *) &d)->i;
}
but in fact gcc manpage tells me it's not (the example is from gcc 4.4
manpage) - this site told me this is ok, casting through a union (2)
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html,
I guess it was considered ok in 2006 but not now (though I'm not sure
why not)... I did that in some places because otherwise there's no way
around assigning the value to a union and passing that around instead.
Curiously though, despite the gcc manpage saying the code might not be
ok, gcc doesn't warn about it in the places I used it.
Anyway, I'm not sure it's worth bothering with this now, as drivers
could be fixed up without any interface changes.
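
For reference, a minimal sketch of the two safe alternatives (function and union names here are illustrative, not from the Mesa tree): store into an actual union object and read the other member, or use memcpy.

```c
#include <string.h>

union fi {
   int i;
   float f;
};

/* Unsafe under -fstrict-aliasing: ((union fi *) &f)->i casts the
 * address of a plain float to a union type it never had.
 * Safe: write into a real union object, then read the other member
 * (gcc documents this form of type-punning as supported). */
static int
float_bits_union(float f)
{
   union fi u;
   u.f = f;
   return u.i;
}

/* Equally safe and fully portable: memcpy, which compilers inline. */
static int
float_bits_memcpy(float f)
{
   int i;
   memcpy(&i, &f, sizeof i);
   return i;
}
```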

Roland


On 08.12.2009 17:19, Keith Whitwell wrote:
 Roland,
 
 This looks OK to me, hopefully this will see us getting on top of strict
 aliasing issues after all these years...
 
 Keith
 
 On Mon, 2009-12-07 at 18:14 -0800, Roland Scheidegger wrote:
 Hello,

 I'm planning to merge gallium-strict-aliasing branch soon, which will
 bring another gallium api change.
 pipe_reference function has different arguments, because the old version
 was pretty much not really useful for strict-aliasing compliant code
 (util_color_pack functions also gets an update for the same reason).
 The goal of course is to enable builds which no longer need
 -fno-strict-aliasing. scons builds already didn't do this (which was a
 bug since the builds were indeed broken).
 I didn't check all drivers for strict-aliasing compliance, but for
 gallium everybody should make sure the code they are submitting is
 according to strict aliasing rules (*). One downside of compiling with
 -fno-strict-aliasing is also that you don't get the warnings wrt strict
 aliasing, so you might have missed that in the past.
 (There are no build system changes yet, there's still some strict
 aliasing violating code in shader/grammar which should get replaced soon
 anyway.)

 (*) Strictly speaking, it looks like c99 actually has undefined
 behaviour writing and reading different members of a union (wtf?), but
 this is considered acceptable here, and all compilers should support it.

 Roland

 




Re: [Mesa3d-dev] gallium-strict-aliasing branch merge

2009-12-08 Thread Roland Scheidegger
On 08.12.2009 17:37, Keith Whitwell wrote:
 On Tue, 2009-12-08 at 08:31 -0800, Roland Scheidegger wrote:
 Keith,

 I think there might be some slight issue with some of the changes in the
 drivers I did. In particular, I was under the impression it would be ok
 to do something like
 union a_union {
   int i;
   double d;
 };
 int f() {
    double d = 3.0;
    return ((union a_union *) &d)->i;
 }
 but in fact gcc manpage tells me it's not (the example is from gcc 4.4
 manpage) - this site told me this is ok, casting through a union (2)
 http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html,
 I guess it was considered ok in 2006 but not now (though I'm not sure
 why not)... I did that in some places because otherwise there's no way
 around assigning the value to the union and pass that around instead.
 Curiously though, despite the gcc manpage saying the code might not be
 ok, gcc doesn't warn about it in the places I used it.
 Anyway, I'm not sure it's worth bothering with this now, as drivers
 could be fixed up without any interface changes.
 
 Is it a lot of extra work to fix?  I wouldn't mind getting on top of
 this once and for all.

Not in the places I touched. It'll just make the code uglier, though at
least the compiler might still optimize extra assignments away.
For example in st_atom_pixeltransfer.c it now looks like this:
util_pack_color_ub(r, g, b, a, pt->format, (union util_color *)(dest + k));
and I'd need to change it to:
union util_color uc;
util_pack_color_ub(r, g, b, a, pt->format, &uc);
*(dest + k) = uc.ui;
Ok, not really a lot more ugly.
Will do this then, though there are other places where things like that
might already be used, and since the compiler does not issue any
warnings it might be a bit time consuming to find all of them.

Roland




Re: [Mesa3d-dev] gallium-strict-aliasing branch merge

2009-12-08 Thread Roland Scheidegger
On 08.12.2009 18:12, Roland Scheidegger wrote:
 On 08.12.2009 17:37, Keith Whitwell wrote:
 On Tue, 2009-12-08 at 08:31 -0800, Roland Scheidegger wrote:
 Keith,

 I think there might be some slight issue with some of the changes in the
 drivers I did. In particular, I was under the impression it would be ok
 to do something like
 union a_union {
   int i;
   double d;
 };
 int f() {
    double d = 3.0;
    return ((union a_union *) &d)->i;
 }
 but in fact gcc manpage tells me it's not (the example is from gcc 4.4
 manpage) - this site told me this is ok, casting through a union (2)
 http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html,
 I guess it was considered ok in 2006 but not now (though I'm not sure
 why not)... I did that in some places because otherwise there's no way
 around assigning the value to the union and pass that around instead.
 Curiously though, despite the gcc manpage saying the code might not be
 ok, gcc doesn't warn about it in the places I used it.
 Anyway, I'm not sure it's worth bothering with this now, as drivers
 could be fixed up without any interface changes.
 Is it a lot of extra work to fix?  I wouldn't mind getting on top of
 this once and for all.
 
 Not in the places I touched. It'll just make the code uglier, though at
 least the compiler might still optimize extra assignments away.
 For example in st_atom_pixeltransfer.c it now looks like this:
 util_pack_color_ub(r, g, b, a, pt->format, (union util_color *)(dest + k));
 and I'd need to change it to:
 union util_color uc;
 util_pack_color_ub(r, g, b, a, pt->format, &uc);
 *(dest + k) = uc.ui;
 Ok, not really a lot more ugly.
 Will do this then, though there are other places where things like that
 might already be used, and since the compiler does not issue any
 warnings it might be a bit time consuming to find all of them.

Ok, unfortunately code in vg_translate.c got a lot more verbose :-(.
Also, I think there's quite some usage of casting void * to other types.
That could also lead to strict-aliasing violations, as you're only
allowed to do casts back to the original type it had (hence the
strict-aliasing warnings if you do *((float *) (void *)
some-uint-value), because the compiler is able to determine original
type). Might be safe though as long as gcc doesn't do too much
interprocedural optimizations, and if it does it should probably be able
to at least output a warning, since in this case it should also be able
to determine the original type I guess...

Roland



Re: [Mesa3d-dev] gallium-strict-aliasing branch merge

2009-12-08 Thread Roland Scheidegger
On 08.12.2009 20:57, Martin Olsson wrote:
 Roland Scheidegger wrote:
 Keith,

 I think there might be some slight issue with some of the changes in the
 drivers I did. In particular, I was under the impression it would be ok
 to do something like
 union a_union {
   int i;
   double d;
 };
 int f() {
    double d = 3.0;
    return ((union a_union *) &d)->i;
 }
 but in fact gcc manpage tells me it's not (the example is from gcc 4.4
 manpage) 
 
 I think the issue you are describing is explained here:
 http://patrakov.blogspot.com/2009/03/dont-use-old-dtoac.html
Yes, probably. Note though it says gcc generates warnings for it, which
didn't happen, so I think gcc would actually not miscompile it.
(I suspect gcc doesn't complain and does not miscompile as long as it
can't determine the original type of the value). Still, the explanation
is imho not really satisfactory. I think a lot of people used to think
this would be perfectly fine (see for instance
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
casting through a union (2)).


 Also note the link he posts to the GCC manual:
 http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Optimize-Options.html#index-fstrict_002daliasing-721
Yep, that's the same stuff I used for the example.

Roland



[Mesa3d-dev] gallium-strict-aliasing branch merge

2009-12-07 Thread Roland Scheidegger
Hello,

I'm planning to merge gallium-strict-aliasing branch soon, which will
bring another gallium api change.
pipe_reference function has different arguments, because the old version
was pretty much not really useful for strict-aliasing compliant code
(util_color_pack functions also gets an update for the same reason).
The goal of course is to enable builds which no longer need
-fno-strict-aliasing. scons builds already didn't do this (which was a
bug since the builds were indeed broken).
I didn't check all drivers for strict-aliasing compliance, but for
gallium everybody should make sure the code they are submitting is
according to strict aliasing rules (*). One downside of compiling with
-fno-strict-aliasing is also that you don't get the warnings wrt strict
aliasing, so you might have missed that in the past.
(There are no build system changes yet, there's still some strict
aliasing violating code in shader/grammar which should get replaced soon
anyway.)

(*) Strictly speaking, it looks like c99 actually has undefined
behaviour writing and reading different members of a union (wtf?), but
this is considered acceptable here, and all compilers should support it.

Roland



Re: [Mesa3d-dev] [RFC] Move _mesa_memcpy to imports.h and inline it

2009-12-04 Thread Roland Scheidegger
On 04.12.2009 11:24, Kenneth Graunke wrote:
 On Thursday 03 December 2009 12:47:36 Brian Paul wrote:
 [snip]
 I've been meaning to go over imports.[ch] and make a bunch of the
 wrapper functions inlines.

 A lot of the wrappers aren't needed any more.  Back before valgrind I
 used the memory-related wrappers quite often.  For now, let's keep the
 wrappers so we don't have to touch tons of other files right away.

 Matt, feel free to submit a patch.

 -Brian
 
 I've attached patches to remove a number of the wrappers, should you decide 
 you want to go that way.
 

 diff --git a/src/mesa/main/imports.c b/src/mesa/main/imports.c
 index 6a34aec..0f10111 100644
 --- a/src/mesa/main/imports.c
 +++ b/src/mesa/main/imports.c
 @@ -268,17 +268,6 @@ _mesa_bzero( void *dst, size_t n )
  #endif
  }
  
 -/** Wrapper around memcmp() */
 -int
 -_mesa_memcmp( const void *s1, const void *s2, size_t n )
 -{
 -#if defined(SUNOS4)
 -   return memcmp( (char *) s1, (char *) s2, (int) n );
 -#else
 -   return memcmp(s1, s2, n);
 -#endif
 -}
 -
  /*...@}*/

So is the different implementation on SUNOS4 no longer relevant?

Roland


--
Join us December 9, 2009 for the Red Hat Virtual Experience,
a free event focused on virtualization and cloud computing. 
Attend in-depth sessions from your desk. Your couch. Anywhere.
http://p.sf.net/sfu/redhat-sfdev2dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs

2009-12-03 Thread Roland Scheidegger
On 03.12.2009 01:38, Jose Fonseca wrote:
 Interesting. Yes we want to fix the problem, as we're missing out
 potential optimizations.
 
 For fixing reference counting, couldn't we fix it by doing the final
 
 
 *pdst = src;
 
 in each pipe_xxx_reference function in the bottom of p_state.h, and
 pass only &(*pdst)->reference, &src->reference to the p_refcnt.h's
 pipe_reference() instead (i.e., just pointers, and no pointer to
 pointers)? I haven't tested it but it seems that that would eliminate
 all casts, hence should be correct.

That is pretty much what I did (except I used a temporary pointer to
pass its address to pipe_reference so I didn't have to change the
pipe_reference function at all, so it still worked save the aliasing
issues for other callers).
Some callers use this directly, I can fix them I guess it's just a bit
less convenient function, if this is the approach we'll take.
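
A hedged sketch of that shape (simplified and non-atomic, with names only loosely following gallium's p_refcnt.h/p_state.h): the typed wrapper does the *pdst = src assignment itself and passes plain struct pipe_reference pointers down, so no pointer-to-pointer casts remain.

```c
#include <stdbool.h>
#include <stddef.h>

struct pipe_reference {
   int count;   /* the real code uses atomic operations */
};

/* Adjusts the two refcounts; returns true when the caller must
 * destroy the object whose count dropped to zero. */
static bool
pipe_reference(struct pipe_reference *dst, struct pipe_reference *src)
{
   bool destroy = false;
   if (dst != src) {
      if (src)
         src->count++;
      if (dst && --dst->count == 0)
         destroy = true;
   }
   return destroy;
}

struct pipe_buffer {
   struct pipe_reference reference;
   /* ... */
};

static void
pipe_buffer_reference(struct pipe_buffer **dst, struct pipe_buffer *src)
{
   struct pipe_buffer *old = *dst;
   if (pipe_reference(old ? &old->reference : NULL,
                      src ? &src->reference : NULL)) {
      /* destruction of old would go here */
   }
   *dst = src;   /* plain typed assignment, no aliasing casts */
}
```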

Roland

 
 Jose
 
  From: Roland Scheidegger
 [srol...@vmware.com] Sent: Wednesday, December 02, 2009 23:19 To:
 Jose Fonseca Cc: mesa3d-...@lists.sf.net Subject: Re: [Mesa3d-dev]
 mesa/gallium strict aliasing bugs
 
 On 02.12.2009 18:33, José Fonseca wrote:
 I've seen strict aliasing assumption causing bugs in other gallium 
 components. It seems endemic to our code.  Unless we actively
 decidee to go and chase the strict aliasing bugs now we should add 
 -fno-strict-aliasing to all our builds.
 
 Do we ever want to fix strict aliasing? If we do, I think the problem
  with refcounting is pretty fundamental (I traced the crash to
 aliasing problems there, and hacked up some bogus version which
 didn't segfault for the testcase I used). At least I can't see a way
 to make this really work in some nice way. Supposedly gcc supports 
 __attribute__((__may_alias__)) but I had no luck with it. In
 gallium (not core mesa) there's only one other offender causing a 
 large amount of warnings, that is util_pack_color, and I think it
 won't actually cause problems.




Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs

2009-12-03 Thread Roland Scheidegger
On 03.12.2009 11:17, Keith Whitwell wrote:
 On Wed, 2009-12-02 at 12:46 -0800, Roland Scheidegger wrote:
 On 02.12.2009 18:33, José Fonseca wrote:
 I've seen strict aliasing assumption causing bugs in other gallium
 components. It seems endemic to our code.  Unless we actively decidee to
 go and chase the strict aliasing bugs now we should add
 -fno-strict-aliasing to all our builds.
 Hmm, actually some of them (in mesa at least) seem to be really
 unnecessary. Take the COPY_4FV macro for instance. I replaced that in a
 simple test program (attached) to either just do direct assignment
 without cast, or use memcpy instead.
 
 That comment was probably true in 1999 -- but possibly not any longer...
 
 The results are actually interesting, the comment says cast is done to
 avoid going through fp registers, but looking in the assembly (at least
 with optimization) that doesn't happen anyway, and the generated code is
 actually nearly identical, but in fact it not only triggers
 strict-aliasing warnings but doesn't work correctly (when compiled with
 -O3 or similar parameters invoking -fstrict-aliasing).
 
 ...
 
 Doesn't use 128bit sse moves but looks like an improvement... When using
 no optimization code certainly gets much less readable and the memcpy
 version will call glibc memcpy (which itself will still be optimized
 hence probably faster despite the function call).
 So I'll kill at least this one and just use _mesa_memcpy there, unless
 there are good reasons not to. I think pretty much all compilers should
 have builtin memcpy optimizations.
 
 I didn't realize COPY_4FV and friends were related to our strict
 aliasing problems -- if that's the case, let's kill or reimplement them
 straight away.
Actually, that was the simplest one, and most other of these macros
don't do this. There's also plenty of warnings in the shader/grammar
code, apart from that there's actually not that many warnings, at least
when not compiling legacy drivers... So I guess getting rid of
strict-aliasing issues is doable for gallium.
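
As an illustration of the kind of replacement being discussed (a sketch, not the actual Mesa macro): a copy macro that goes through integer casts can simply become memcpy, which compilers expand inline.

```c
#include <string.h>

/* The old-style macro copied the four floats through int casts to
 * avoid x87 loads, roughly:
 *   #define COPY_4FV(DST, SRC) \
 *      do { ((int *)(DST))[0] = ((const int *)(SRC))[0]; ... } while (0)
 * which violates strict aliasing.  The memcpy form does not, and
 * generates near-identical code with optimization enabled: */
#define COPY_4FV(DST, SRC)  memcpy((DST), (SRC), 4 * sizeof(float))
```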

Roland



Re: [Mesa3d-dev] gallium-noblocks branch merge

2009-12-03 Thread Roland Scheidegger
On 03.12.2009 20:55, Christoph Bumiller wrote:
 Roland Scheidegger schrieb:
 Hi,

 I'm planning to merge gallium-noblocks branch to master soon. This api
 change may affect your driver, statetracker, whatever. I _should_ have
 fixed up all in tree stuff using it, but that's not a guarantee it will
 still run correctly (nv50 driver was strange for instance), and
 What's strange with nv50 ?
  There's this one if (!pt->nblocksx[level]) { in nv50_transfer.c that
 was an unnecessary leftover because I hadn't seen miptree_blanket forgot
 the initialize these and pushed a bit too early, thankfully this is now
 gone automatically.
Ok, this is mostly what was strange. That and it was the driver which by
far needed the most changes :-).

 
 I just need the y blocks everywhere instead of just y because things
 like offset = stride * y is simply wrong if you have *actual*
 multi-pixel blocks (pitch as in nblocksx * width).
Yes, drivers are encouraged to use the block helpers. This way they
don't need to special case any formats, as it should work for
uncompressed, dxt, or things like ycbcr just the same.
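The kind of per-format arithmetic those helpers hide can be sketched like this (the names and the block descriptor are illustrative only, not the actual Gallium helper API):

```c
#include <assert.h>

/* Illustrative block descriptor: block dimensions in pixels and block
 * size in bytes. An uncompressed 32-bit format is 1x1 with 4 bytes;
 * DXT1 is 4x4 with 8 bytes. */
struct block_desc {
   unsigned width, height, size;
};

/* Number of blocks needed to cover a span of pixels, rounded up. */
static unsigned
nblocks(unsigned pixels, unsigned block_dim)
{
   return (pixels + block_dim - 1) / block_dim;
}

/* Byte offset of a pixel (x, y) that lies on a block boundary,
 * given the stride in bytes between block rows. */
static unsigned
image_offset(const struct block_desc *b, unsigned stride,
             unsigned x, unsigned y)
{
   return (y / b->height) * stride + (x / b->width) * b->size;
}
```

With helpers along these lines a driver computes offsets the same way for uncompressed, DXT, or YCbCr layouts; only the block descriptor changes per format.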

 I hope no one will try to transfer just parts of a block (makes not much
 sense for DXT imo though).
Yes, this shouldn't happen. Neither ogl nor dx should trigger this (it's
not allowed for CompressedTexSubImage), so transfers are required to
only happen along block boundaries.

Roland



Re: [Mesa3d-dev] [RFC] Move _mesa_memcpy to imports.h and inline it

2009-12-03 Thread Roland Scheidegger
On 03.12.2009 19:46, Matt Turner wrote:
 Most of the functions in imports.c are very small, so the function
 call overhead is large relative to their size. Can't we do something
 like in the attached patch and move them to imports.h and mark them
 static inline? Things like memcpy and memset are often optimized out
 by the compiler, and by hiding them in these wrapper functions, we're
 probably losing any benefits we'd otherwise see.
++ from me, at least for the very simple wrappers. _mesa_memcpy
especially I think can be very nicely used for array assignments and the
like, and in case of (very) small amounts of data to copy call overhead
might be significant.
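A minimal sketch of the proposed change — moving the thin wrapper into the header as static inline (a hypothetical simplification of the real wrapper, which also had platform #ifdefs):

```c
#include <string.h>

/* With the body visible at the call site, the compiler can turn a
 * fixed-size copy into a few direct moves instead of a libc call. */
static inline void
_mesa_memcpy(void *dst, const void *src, size_t n)
{
   memcpy(dst, src, n);
}
```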

 Similarly, if we're
 going to use a magic sqrtf algorithm (apparently for speed) then
 shouldn't we also let the compiler properly inline the function?
Not sure here, the function is still quite complex, I don't think call
overhead will make any difference. I've looked at the code though when
it wasn't using the fast path (with -O3 but DEBUG - why is this different?)
This version though adds a lot of overhead:
- call overhead for _mesa_sqrtf
- overhead converting to double
- overhead converting back
In the generated code the actual sqrtf code was a single assembly
instruction (sqrtsd %xmm0, %xmm0) - granted that's SSE2 only, and it
requires quite a few cycles. Still, I guess the overhead is significant,
not to mention that if we'd just use a float instead of a double, not
only would we not have to convert the type, but the compiler would
actually issue sqrtss %xmm0, %xmm0 instead, which is (depending on the
cpu) twice as fast. Not sure why we use double there - are there
platforms where sqrtf(float x) isn't supported?
So really, call overhead is a tiny fraction of the optimization
potential for this function. When not using DEBUG (and USE_IEEE is
defined) the function is still quite a few cycles, so call overhead
doesn't look that bad either. I don't actually know which version is
faster (or more accurate - I think though sqrtss is actually fully
accurate). Of course using sqrtf(x) will only be fast if the cpu
supports some kind of fast float unit (and the compiler knows how to use
it).
If you'd want to do some more optimization, there's for instance
_mesa_inv_sqrtf - it is supposedly fast, but sse2 offers rsqrtss, which
is really fast. However, I remember we got some bugs some time ago when
gcc actually used that, because precision wasn't enough - it will do
this if you enable -funsafe-math-optimizations, -mrecip or similar. I've
just seen though actually that at least gcc 4.4 does an additional
newton-raphson step when you do 1.0/sqrtf(float x) (so it will issue
rsqrtss plus a couple muls and adds), which might still be less or even
more accurate, and almost certainly be faster than the manual version.
So there's probably far more optimization potential than the call
overhead. Most of those functions are probably never used in any
performance critical path anyway.


 
 I also don't quite understand wrapper functions like
 double
 _mesa_pow(double x, double y)
 {
return pow(x, y);
 }
 
 Maybe at one time these had #ifdefs in them like _mesa_memcpy, but I
 can't see any reason not to remove it now.
 
 Someone enlighten me.
I guess there might indeed have been #ifdefs in the past. In any case,
using wrappers would make it easier to implement such optimizations in
the future if anyone wants to, not that this is something you'd
probably want to do (that stuff is probably better left up to the
compiler). So, at least if they are inlined, they shouldn't really hurt
either.




[Mesa3d-dev] mesa/gallium strict aliasing bugs

2009-12-02 Thread Roland Scheidegger
Hi,

I've come across some bug (which I thought might be related to the
gallium-noblocks branch, but it's not) which caused a segfault but only
when not using debug builds. I think this is the same issue Vinson was
seeing some time ago. Looks like an impossible backtrace:

#0  st_texture_image_copy (pipe=0x612640, dst=0x0,
dstLevel=<value optimized out>, src=0x6e1dd0, face=0)
at src/mesa/state_tracker/st_texture.c:306
#1  0x7759b383 in copy_image_data_to_texture (
ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0,
needFlush=<value optimized out>)
at src/mesa/state_tracker/st_cb_texture.c:1673
#2  st_finalize_texture (ctx=<value optimized out>,
pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value
optimized out>)
at src/mesa/state_tracker/st_cb_texture.c:1807
#3  0x7758fd9d in finalize_textures (st=0x68a9c0)
at src/mesa/state_tracker/st_atom_texture.c:144

Segfault seems to be because dst is 0x0, but if you look at the call
stack it is easy to see this is impossible.
That would point to a gcc optimizer issue (using gcc 4.4.1), except there
are quite a few warnings during compile, especially about violating
strict-aliasing rules...
So, in the gallium.py scons file there's actually this:
if debug:
    ccflags += ['-O0', '-g3']
elif env['CCVERSION'].startswith('4.2.'):
    # gcc 4.2.x optimizer is broken
    print 'warning: gcc 4.2.x optimizer is broken -- disabling optimizations'
    ccflags += ['-O0', '-g3']
else:
    ccflags += ['-O3', '-g3']


So I added -fno-strict-aliasing and indeed, segfault is gone. Hence I
believe this is incorrectly accusing the gcc 4.2 optimizer, whereas it's
actually a code bug, and certainly it is not restricted to gcc 4.2
(unless this addressed a different problem).
Not quite sure why the code violates strict-aliasing rules in all those
places - about half of the warnings are from pipe_reference
(p_refcount.h:85). Not sure if all warnings are actually real issues,
and not sure how this should be fixed (should we try to fix this for
real or just force -fno-strict-aliasing).

Roland



Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs

2009-12-02 Thread Roland Scheidegger
On 02.12.2009 18:33, José Fonseca wrote:
 On Wed, 2009-12-02 at 09:05 -0800, Roland Scheidegger wrote:
 Hi,

 I've come across some bug (which I thought might be related to the
 gallium-noblocks branch, but it's not) which caused a segfault but only
 when not using debug builds. I think this is the same issue Vinson was
 seeing some time ago. Looks like an impossible backtrace:

 #0  st_texture_image_copy (pipe=0x612640, dst=0x0,
 dstLevel=<value optimized out>, src=0x6e1dd0, face=0)
 at src/mesa/state_tracker/st_texture.c:306
 #1  0x7759b383 in copy_image_data_to_texture (
 ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0,
 needFlush=<value optimized out>)
 at src/mesa/state_tracker/st_cb_texture.c:1673
 #2  st_finalize_texture (ctx=<value optimized out>,
 pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value
 optimized out>)
 at src/mesa/state_tracker/st_cb_texture.c:1807
 #3  0x7758fd9d in finalize_textures (st=0x68a9c0)
 at src/mesa/state_tracker/st_atom_texture.c:144

 Segfault seems to be because dst is 0x0, but if you look at the call
 stack it is easy to see this is impossible.
 That would point to a gcc optimizer issue (using gcc 4.4.1), except there
 are quite a few warnings during compile, especially about violating
 strict-aliasing rules...
 So, in the gallium.py scons file there's actually this:
 if debug:
     ccflags += ['-O0', '-g3']
 elif env['CCVERSION'].startswith('4.2.'):
     # gcc 4.2.x optimizer is broken
     print 'warning: gcc 4.2.x optimizer is broken -- disabling optimizations'
     ccflags += ['-O0', '-g3']
 else:
     ccflags += ['-O3', '-g3']


 So I added -fno-strict-aliasing and indeed, segfault is gone. Hence I
 believe this is incorrectly accusing the gcc 4.2 optimizer, whereas it's
 actually a code bug, and certainly it is not restricted to gcc 4.2
 (unless this addressed a different problem).
 
 It addressed a different problem. Type 
 
   git show bb8f3090ba37aa3f24943fdb43c4120776289658
 
 to see an explanation of it.
Ok.

 
 Not quite sure why the code violates strict-aliasing rules in all those
 places - about half of the warnings are from pipe_reference
 (p_refcount.h:85). Not sure if all warnings are actually real issues,
 and not sure how this should be fixed (should we try to fix this for
 real or just force -fno-strict-aliasing).
 
 I read (forgot where) that gcc strict aliasing warnings don't catch all
 cases.
The gcc man page states this (-Wstrict-aliasing=n). It says though (as
of gcc 4.4.1) that with n=3 (the default) there should be very few
false positives and few false negatives.
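A minimal illustration of the kind of code the warning targets (a hypothetical example, not taken from the Mesa tree):

```c
#include <string.h>

/* Reading a float's bit pattern through an incompatible pointer type
 * violates C's aliasing rules; gcc -O2 -Wstrict-aliasing warns here,
 * and the optimizer is free to reorder loads/stores around it. */
static unsigned
float_bits_broken(const float *f)
{
   return *(const unsigned *) f;   /* undefined behavior */
}

/* memcpy is the well-defined way to do the same thing; compilers
 * lower a fixed 4-byte copy to a single move. */
static unsigned
float_bits_safe(float f)
{
   unsigned u;
   memcpy(&u, &f, sizeof u);
   return u;
}
```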

 
 I've seen strict aliasing assumption causing bugs in other gallium
 components. It seems endemic to our code.  Unless we actively decide to
 go and chase the strict aliasing bugs now we should add
 -fno-strict-aliasing to all our builds.
Ok. I guess though there's no guarantee it won't break other compilers
where we haven't set any flags for this.

Roland




Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs

2009-12-02 Thread Roland Scheidegger
On 02.12.2009 18:33, José Fonseca wrote:
 I've seen strict aliasing assumption causing bugs in other gallium
 components. It seems endemic to our code.  Unless we actively decide to
 go and chase the strict aliasing bugs now we should add
 -fno-strict-aliasing to all our builds.

Hmm, actually some of them (in mesa at least) seem to be really
unnecessary. Take the COPY_4FV macro for instance. I replaced that in a
simple test program (attached) to either just do direct assignment
without cast, or use memcpy instead.
The results are actually interesting: the comment says the cast is done
to avoid going through fp registers, but looking at the assembly (at
least with optimization) that doesn't happen anyway, and the generated
code is nearly identical. In fact the cast version not only triggers
strict-aliasing warnings but doesn't work correctly (when compiled with
-O3 or similar parameters invoking -fstrict-aliasing).

assign_cast:
.LFB45:
.cfi_startproc
movl(%rsi), %edx
leaq4(%rdi), %rax
movl%edx, 4(%rdi)
movl4(%rsi), %edx
movl%edx, 4(%rax)
movl8(%rsi), %edx
movl%edx, 8(%rax)
movl12(%rsi), %edx
movl%edx, 12(%rax)
ret
.cfi_endproc

assign:
.LFB46:
.cfi_startproc
movl(%rsi), %eax
movl%eax, 4(%rdi)
movl4(%rsi), %eax
movl%eax, 8(%rdi)
movl8(%rsi), %eax
movl%eax, 12(%rdi)
movl12(%rsi), %eax
movl%eax, 16(%rdi)
ret
.cfi_endproc


But clearly using memcpy the compiler does a better job:
assign_cpy:
.LFB44:
.cfi_startproc
movq(%rsi), %rax
movq%rax, 4(%rdi)
movq8(%rsi), %rax
movq%rax, 12(%rdi)
ret
.cfi_endproc
.LFE44:

It doesn't use 128-bit sse moves, but looks like an improvement... When
compiled without optimization the code certainly gets much less
readable, and the memcpy version will call glibc memcpy (which itself
is still optimized, hence probably faster despite the function call).
So I'll kill at least this one and just use _mesa_memcpy there, unless
there are good reasons not to. I think pretty much all compilers should
have builtin memcpy optimizations.

Roland
#include <string.h>
#include <stdio.h>

#define COPY_4FV( DST, SRC )  \
do {  \
   const unsigned *_s = (const unsigned *) (SRC); \
   unsigned *_d = (unsigned *) (DST); \
   _d[0] = _s[0]; \
   _d[1] = _s[1]; \
   _d[2] = _s[2]; \
   _d[3] = _s[3]; \
} while (0)

#define COPY_4FV_NOCAST( DST, SRC )   \
do {  \
   (DST)[0] = (SRC)[0]; \
   (DST)[1] = (SRC)[1]; \
   (DST)[2] = (SRC)[2]; \
   (DST)[3] = (SRC)[3]; \
} while (0)

#define COPY_4FV_MEMCPY( DST, SRC )   \
do {  \
   memcpy(DST, SRC, sizeof(float) * 4);\
} while (0)

struct sfloat
{
   unsigned unused;
   float p[4];
};

void assign_cpy(struct sfloat *s, float *param)
{
   COPY_4FV_MEMCPY(s->p, param);
}

void assign_cast(struct sfloat *s, float *param)
{
   COPY_4FV(s->p, param);
}

void assign(struct sfloat *s, float *param)
{
   COPY_4FV_NOCAST(s->p, param);
}

int main(void)
{
   float fl[4] = {0.1,0.2,0.3,0.4};
   struct sfloat s1;
   struct sfloat s2;
   struct sfloat s3;
   assign(&s1, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n", s1.p[0], s1.p[1], s1.p[2], s1.p[3]);
   assign_cpy(&s2, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n", s2.p[0], s2.p[1], s2.p[2], s2.p[3]);
   assign_cast(&s3, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n", s3.p[0], s3.p[1], s3.p[2], s3.p[3]);
   return 0;
}


[Mesa3d-dev] gallium-noblocks branch merge

2009-12-02 Thread Roland Scheidegger
Hi,

I'm planning to merge gallium-noblocks branch to master soon. This api
change may affect your driver, statetracker, whatever. I _should_ have
fixed up all in tree stuff using it, but that's not a guarantee it will
still run correctly (nv50 driver was strange for instance), and
certainly if you have out of tree things they will break.
The changes themselves should be fairly simple, you can read more about
them in the git log file.

Roland



Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs

2009-12-02 Thread Roland Scheidegger
On 02.12.2009 18:33, José Fonseca wrote:
 I've seen strict aliasing assumption causing bugs in other gallium
 components. It seems endemic to our code.  Unless we actively decide to
 go and chase the strict aliasing bugs now we should add
 -fno-strict-aliasing to all our builds.

Do we ever want to fix strict aliasing? If we do, I think the problem
with refcounting is pretty fundamental (I traced the crash to aliasing
problems there, and hacked up some bogus version which didn't segfault
for the testcase I used). At least I can't see a way to make this really
work in some nice way. Supposedly gcc supports
__attribute__((__may_alias__)) but I had no luck with it.
In gallium (not core mesa) there's only one other offender causing a
large number of warnings - util_pack_color - and I think it won't
actually cause problems.
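For reference, the union-based pattern is the other common way to sidestep the rules when a cast-based macro has to stay (a sketch; whether it helps pipe_reference specifically is another question):

```c
/* Type-punning through a union is accepted by gcc as an aliasing-safe
 * idiom (well-defined in C99 per TC3); __attribute__((__may_alias__))
 * is the per-type alternative mentioned above. */
union fi {
   float f;
   unsigned u;
};

static unsigned
float_bits(float f)
{
   union fi pun;
   pun.f = f;
   return pun.u;
}
```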



Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: Fix array out-of-bounds access by _mesa_TexGeni.

2009-12-01 Thread Roland Scheidegger
On 01.12.2009 11:16, Ian Romanick wrote:
 Speaking of which... there are a bunch of conflicts merging 7.7 to
 master in Galliumland.  Could one of you guys take a look at it?  I have
 no clue what's going on over there.
Quite a few of those were due to the gallium interface changes
(introduced by the width0 branch merge). It will only get worse when I
merge the gallium-noblocks branch (not quite there yet). Those changes
are fairly intrusive as the API changes affect a lot of files/code.

Roland





Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: Fix array out-of-bounds access by _mesa_TexGeni.

2009-12-01 Thread Roland Scheidegger
On 01.12.2009 15:35, Keith Whitwell wrote:
 On Tue, 2009-12-01 at 06:31 -0800, Roland Scheidegger wrote:
 On 01.12.2009 11:16, Ian Romanick wrote:
 Speaking of which... there are a bunch of conflicts merging 7.7 to
 master in Galliumland.  Could one of you guys take a look at it?  I have
 no clue what's going on over there.
 Quite a few of that was due to the gallium interface changes (introduced
 by width0 branch merge). It will get only worse when I merge
 gallium-noblocks branch (not quite there yet). Those changes are fairly
 intrusive as those API changes affect a lot of files/code.
 
 They were pretty minimal really - but there was some knowledge required
 of what is new and what is old.  It's not much fun resolving
 conflicts in code you don't know about, but the conflicts themselves
 weren't onerous.

Yes, the changes themselves are pretty simple - it's just because so
many files are affected there's a lot of potential for future merge
conflicts. Nothing really difficult to resolve, but annoying nonetheless
(but there's no way to avoid it).

Roland



Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state

2009-11-27 Thread Roland Scheidegger
On 27.11.2009 19:32, michal wrote:
 Why is the MAX here smaller than for fragment samplers?  Doesn't GL
 require them to be the same, because GL effectively binds the same set
 of sampler states in both cases?  

 Can you take a closer look at what the GL state tracker would have to do
 to expose this functionality and make sure it's valid?

   
 
 It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many 
 samplers can be used in a vertex shader. Anything above that is used 
 only with fragment shaders and ignored for vertex shaders.
I fail to see though why the limit needs to be that low. All modern
hardware nowadays can use the same number of texture samplers for both
fragment and vertex shading (it's the same sampler hardware, after all).
Some older hardware (typically non-unified, D3D9 shader model 3
compliant) indeed only had limited support for this (like the GeForce
6/7 series), probably supporting only 4 (can't remember exactly), while
other hardware never implemented it at all despite D3D9 SM3 requiring
it (thanks to an API loophole).

Roland

--
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


[Mesa3d-dev] gallium width0 branch merge

2009-11-26 Thread Roland Scheidegger
Hi,

just a warning: I'm planning on merging the width0 branch to master
tomorrow. This is an interface change eliminating the
width/height/depth arrays from
drivers/state trackers should be fixed (I think though there might be
bugs with rbug), but obviously if you have any out-of-tree drivers they
will break (though should be trivial to fix).

Roland



[Mesa3d-dev] st_shader-varients merge tomorrow

2009-11-24 Thread Roland Scheidegger
I'm planning to merge st_shader-varients branch to master tomorrow.

This should not adversely affect drivers, unless they rely on generic
inputs/outputs semantic_index always starting at 0 without holes
(something that they shouldn't do but it would have worked previously).
Feedback for hw drivers welcome, I'll try i915 myself, but I can't test
the others, though some quick glance seemed to suggest they should be ok.

Roland



Re: [Mesa3d-dev] Blit support for r300

2009-10-23 Thread Roland Scheidegger
On 23.10.2009 08:37, Maciej Cencora wrote:
 Hi,
 
 as you may already know r300 classic driver is in pretty good shape these 
 days, but there's one thing that causes major slowdowns in many games: lack 
 of 
 hardware accelerated blit operation. 
The same is true for r100/r200...

 Currently all glCopyTex[Sub]Image operations are done through span functions 
 which is slow as hell.
 We could use the hw blitter unit, but using it causes stalls because of the 
 2D/3D mode switch.
A long time ago I implemented this as a hack for r200 (just blit
directly to the texture in vram, so never touching the backup texture in
system memory). Worked quite well in practice (good enough for doom3
special effects). I didn't notice any obvious slowdowns due to 2d/3d
sync issues (though maybe I didn't do any syncs...).

 I was wondering how this could be fixed and I got this crazy idea of porting 
 the everything-is-texture concept from gallium to classic mesa. Actually not 
 all of it, just the pieces that make the renderbuffers look like textures for 
 the driver.
You could probably just try to hack up a blit using 3d engine? Though of
course lots of setup would be needed. Nice thing about not using blitter
 (apart from potential performance issues) is of course that you can
also support format conversion for free.

 
 Brian, what do you think about this idea? Is it feasible and worth doing?
 Maybe you have better ideas how to resolve this issue?
Not sure what Brian's opinion on that is, but I'm not sure if there's
really much point in trying to port over half of gallium to classic
mesa. Looks to me like time might be better spent working on gallium
drivers instead...

Roland

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] [PATCH 1/2] mesa: Compact state key for TexEnv program cache

2009-09-02 Thread Roland Scheidegger
Hmm, I'm not actually sure this will always reduce the state key size. I
think the compiler is still allowed to pad the mode_opt struct out to
whatever it likes (maybe #pragma pack(1) can prevent this), even though
maybe gcc does not.
I don't like pragmas too much, but it looks the only way to do this in
some clean c99 way would be to get rid of the mode_opt struct entirely?

Roland

On 02.09.2009 16:23, Brian Paul wrote:
 Unfortunately gcc (version 4.3.2 anyway) warns on this:
 main/texenvprogram.c:87: warning: type of bit-field ‘Source’ is a GCC 
 extension
 main/texenvprogram.c:88: warning: type of bit-field ‘Operand’ is a GCC 
 extension
 
 I'm trying to find a #pragma or something to silence the warning...
 
 -Brian
 
 Keith Whitwell wrote:
 Looks great Chris.

 Keith

 On Wed, 2009-09-02 at 05:11 -0700, Chris Wilson wrote:
 By rearranging the bitfields within the key we can reduce the size
 of the key from 644 to 196 bytes, reducing the cost of both the
 hashing and equality tests.
 ---
  src/mesa/main/texenvprogram.c |7 ---
  1 files changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/main/texenvprogram.c b/src/mesa/main/texenvprogram.c
 index 5913957..3851937 100644
 --- a/src/mesa/main/texenvprogram.c
 +++ b/src/mesa/main/texenvprogram.c
 @@ -82,8 +82,8 @@ texenv_doing_secondary_color(GLcontext *ctx)
  #define DISASSEM (MESA_VERBOSE  VERBOSE_DISASSEM)
  
  struct mode_opt {
 -   GLuint Source:4;  /** SRC_x */
 -   GLuint Operand:3; /** OPR_x */
 +   GLubyte Source:4;  /** SRC_x */
 +   GLubyte Operand:3; /** OPR_x */
  };
  
  struct state_key {
 @@ -103,10 +103,11 @@ struct state_key {
  
GLuint NumArgsRGB:3;  /** up to MAX_COMBINER_TERMS */
GLuint ModeRGB:5; /** MODE_x */
 -  struct mode_opt OptRGB[MAX_COMBINER_TERMS];
  
GLuint NumArgsA:3;  /** up to MAX_COMBINER_TERMS */
GLuint ModeA:5; /** MODE_x */
 +
 +  struct mode_opt OptRGB[MAX_COMBINER_TERMS];
struct mode_opt OptA[MAX_COMBINER_TERMS];
 } unit[MAX_TEXTURE_UNITS];
  };


 
 




Re: [Mesa3d-dev] Merging asm-shader-rework-1 branch today

2009-08-24 Thread Roland Scheidegger
On 23.08.2009 01:50, Ian Romanick wrote:
 Philipp Heise wrote:
 Ian Romanick wrote:
 Roland Scheidegger wrote:

 glprogs/R200_interaction.vp
 GL_PROGRAM_ERROR_STRING_ARB: line 1, char 43: error: syntax error,
 unexpected $undefined
 Okay.  I posted a patch to bug #23457 that should fix this.  Could you
 give it a test on R200, and let me know?  I've only run this on my
 laptop, and I don't have Doom3 installed there.  I haven't yet tested it
 in-game.
 Hi Ian,
 
 thanks for your great work!
 The problem is not the header of the vp, but the additional carriage
 return at the end of each line ... DOS newline-format. Therefore the
 parser fails at the end of the header line. The attached patch should
 fix the problem.
 
 Oh good grief!  It's always the little things.  Hmm... Unix uses \n, and
 DOS uses \r\n.  Don't Macs use \r?  If that's the case, the proposed
 patch could cause the line numbers to be incorrect if the shaders are
 authored on Macs.  I should be able to whip up a patch that will handle
 that case, but it will have to wait until later today.  Thanks for
 tracking this down.
 

Works perfectly now indeed. Sorry for leading you down the wrong track
first; I don't know why the doom3 error output doesn't include the
header of the shader even though it's actually there.

Roland




Re: [Mesa3d-dev] Merging asm-shader-rework-1 branch today

2009-08-22 Thread Roland Scheidegger
On 21.08.2009 20:26, Ian Romanick wrote:
 All,
 
 In the next couple hours I'm planning to merge the asm-shader-rework-1
 branch to master.  In my testing I have found that it passes at least as
 many (and in a couple cases more) tests than the current code.  One of
 our internal tests runs about 89,000 vertex programs.  This test takes
 about 30 minutes (1,800 seconds) on current Mesa master.  On the new
 code it takes about 25 seconds.
Good work!

It seems to break (all of) doom3's (vertex, at least) shaders however.
At least with r200, here's the doom3 output for the main r200 vertex
shader (others break in exactly the same way).

glprogs/R200_interaction.vp
GL_PROGRAM_ERROR_STRING_ARB: line 1, char 43: error: syntax error,
unexpected $undefined

error at 34:
ariant ;

# this is slightly simpler than the ARB interaction,
# because the R200 can only emit six texture coordinates,
# so we assume that the diffuse and specular matrixes are
# the same, with higher level code splitting it into two
# passes if it isn't
#
# I am using texcoords instead of attribs, because a separate
# extension is required to use attribs with vertex array objects.
#
# input:
#
# TEX0  texture coordinates
# TEX1  tangent[0]
# TEX2  tangent[1]
# TEX3  normal
# COL   vertex color
#
# c[4]  localLightOrigin
# c[5]  localViewOrigin
# c[6]  lightProjection S
# c[7]  lightProjection T
# c[8]  lightProjection Q
# c[9]  lightFalloff S
# c[10] bumpMatrix S
# c[11] bumpMatrix T
# c[12] diffuseMatrix S
# c[13] diffuseMatrix T
# c[14] specularMatrix S
# c[15] specularMatrix T
#
# output:
#
# texcoord 0 = light projection texGen
# texcoord 1 = light falloff texGen
# texcoord 2 = bumpmap texCoords
# texcoord 3 = specular / diffuse texCoords
# texcoord 4 = normalized halfangle vector in tangent space
# texcoord 5 = unnormalized vector to light in tangent space

TEMP	R0, R1, R2, lightDir;

PARAM   defaultTexCoord = { 0, 0.5, 0, 1 };

# texture 0 has three texgens
DP4 result.texcoord[0].x, vertex.position, program.env[6];
DP4 result.texcoord[0].y, vertex.position, program.env[7];
DP4 result.texcoord[0].w, vertex.position, program.env[8];

# texture 1 has one texgen
MOV result.texcoord[1], defaultTexCoord;
DP4 result.texcoord[1].x, vertex.position, program.env[9];

# textures 2 takes the base coordinates by the texture matrix
MOV result.texcoord[2], defaultTexCoord;
DP4 result.texcoord[2].x, vertex.texcoord[0], program.env[10];
DP4 result.texcoord[2].y, vertex.texcoord[0], program.env[11];

# textures 3 takes the base coordinates by the texture matrix
MOV result.texcoord[3], defaultTexCoord;
DP4 result.texcoord[3].x, vertex.texcoord[0], program.env[12];
DP4 result.texcoord[3].y, vertex.texcoord[0], program.env[13];

# texture 4's texcoords will be the halfangle in tangent space

# calculate normalized vector to light in R0
SUB lightDir, program.env[4], vertex.position;
DP3 R1, lightDir, lightDir;
RSQ R1, R1.x;
MUL R0, lightDir, R1.x;

# calculate normalized vector to viewer in R1
SUB R1, program.env[5], vertex.position;
DP3 R2, R1, R1;
RSQ R2, R2.x;
MUL R1, R1, R2.x;

# add together to become the half angle vector in object space (non-normalized)
ADD R0, R0, R1;

# put into texture space
DP3 result.texcoord[4].x, vertex.texcoord[1], R0;
DP3 result.texcoord[4].y, vertex.texcoord[2], R0;
DP3 result.texcoord[4].z, vertex.texcoord[3], R0;

# texture 5's texcoords will be the unnormalized lightDir in tangent space
DP3 result.texcoord[5].x, vertex.texcoord[1], lightDir;
DP3 result.texcoord[5].y, vertex.texcoord[2], lightDir;
DP3 result.texcoord[5].z, vertex.texcoord[3], lightDir;

# generate the vertex color, which can be 1.0, color, or 1.0 - color
# for 1.0 : env[16] = 0, env[17] = 1
# for color : env[16] = 1, env[17] = 0
# for 1.0-color : env[16] = -1, env[17] = 1
MAD result.color, vertex.color, program.env[16], program.env[17];

END



Re: [Mesa3d-dev] Mesa (master): i965: Use _MaxElement instead of index-calculated min/ max for VBO bounds.

2009-08-14 Thread Roland Scheidegger
On 13.08.2009 12:19, Michel Dänzer wrote:
 On Wed, 2009-08-12 at 11:31 -0700, Eric Anholt wrote:
 Module: Mesa Branch: master Commit:
 e643bc5fc7afb563028f5a089ca5e38172af41a8 URL:
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=e643bc5fc7afb563028f5a089ca5e38172af41a8
 
 
 Author: Eric Anholt e...@anholt.net Date:   Tue Aug 11 12:59:09
 2009 -0700
 
 i965: Use _MaxElement instead of index-calculated min/max for VBO
 bounds.
 
 This change breaks things all over the place here. E.g.
 progs/glsl/array and .../skinning are missing most of the geometry,
 and a lot of the other glsl progs have weird lighting.
The problem here is that for vertex buffer elements which have zero
stride, count is 1. This count is now used as the bounds check, which
will hence fail for most indices - according to the docs the hardware
will simply return 0 in this case (and not the data at the start index
or something like that, which would work). I'll fix this (it should be
ok to just disable the bounds check in these cases, since in fact any
index is valid if the stride is 0). (Looks like this isn't an issue for
IGDNG (Ironlake, I assume) since it appears this one uses an address
range instead of a maximum index to check against, according to the
code...)

Roland



Re: [Mesa3d-dev] ATI Mobility Radeon X300: Blender menus all black (or white)

2009-07-31 Thread Roland Scheidegger
On 31.07.2009 10:26, Terry Barnaby wrote:
 Hi, I have a problem with the Mesa DRI Radeon 300 driver in that I cannot use the
 blender application as the menus are not displayed correctly. See bug:
 https://bugs.freedesktop.org/show_bug.cgi?id=21774
 
 I would like to get this fixed as I need to be able to run blender at a 
 reasonable speed on this system. I am running the latest 
 DRM/MESA/xf86-video-ati 
 code from git. Are there any pointers on how to start debugging this ?
 For example, can I turn off various hardware acceleration features one by one
 until I find the source of the problem ?

This looks like an issue with dri2 and front buffer rendering. Did you
try if this still works with dri1? You probably can't just disable dri2,
but disabling kms (nomodeset boot param) should force the driver to use
dri1 (for radeon cards) I think.

Roland



Re: [Mesa3d-dev] ATI Mobility Radeon X300: Blender menus all black (or white)

2009-07-31 Thread Roland Scheidegger
On 31.07.2009 15:35, Terry Barnaby wrote:
 On 07/31/2009 02:15 PM, Roland Scheidegger wrote:
 On 31.07.2009 10:26, Terry Barnaby wrote:
 Hi, I have a problem with the Mesa DRI Radeon 300 driver in that I cannot use the
 blender application as the menus are not displayed correctly. See bug:
 https://bugs.freedesktop.org/show_bug.cgi?id=21774

 I would like to get this fixed as I need to be able to run blender at a
 reasonable speed on this system. I am running the latest 
 DRM/MESA/xf86-video-ati
 code from git. Are there any pointers on how to start debugging this ?
 For example, can I turn off various hardware acceleration features one by 
 one
 until I find the source of the problem ?
 This looks like an issue with dri2 and front buffer rendering. Did you
 try if this still works with dri1? You probably can't just disable dri2,
 but disabling kms (nomodeset boot param) should force the driver to use
 dri1 (for radeon cards) I think.

 Roland
 Thanks for the reply.
 I have tried a lot of user level configuration options such as nomodeset,
 disabling EXA etc. None of these had any effect on the blender issue.
 Using nomodeset on the kernel boot line did appear to change the DRI 
 interface 
 from DRI2 to DRI1 (at least from the glxinfo report).
 
 Hence my move to trying the latest code from git ...

Oh, so it also happens with DRI1. It would probably be easiest (as
others have suggested) to do a git bisect then. Alternatively, since
this appears to work on r200 but not r300, you could try manually
finding differences wrt front buffer rendering between those drivers -
though it seems unlikely this still happens to work with git master on
r200 but not r300, since pretty much all of the code which could affect
this issue should be shared.

Roland



Re: [Mesa3d-dev] ATI R200 code currently broken in git

2009-07-31 Thread Roland Scheidegger
On 31.07.2009 17:36, Terry Barnaby wrote:
 I have just compiled/installed the latest drm/mesa/xf86-video-ati code from 
 git 
 under Fedora 11 on a system with an ATI Technologies Inc RV280 [Radeon 9200 
 PRO]
 graphics board.
 
 2D appears fine. 3D is quite broken.
 
 glxgears runs showing about 500 frames/sec. However, it shows moving gears
 for 1 second followed by 5 seconds of a still frame, repeated.
 
 blender just shows a really corrupted screen with chess board patterns,
 lots of horizontal tearing, half-drawn menus etc ...

Hmm, I don't see that. I'm using a very old drm/ddx on that box though,
so no kms/dri2, and no compositing either.

Roland



