Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats
On 06.09.2010 15:57, José Fonseca wrote:

I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is: formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter.

There is a helper function, util_is_format_compatible(), to help make this decision. These are the non-trivial conversions that this function currently recognizes (the list was produced by u_format_compatible_test.c):

  b8g8r8a8_unorm - b8g8r8x8_unorm
  a8r8g8b8_unorm - x8r8g8b8_unorm
  b5g5r5a1_unorm - b5g5r5x1_unorm
  b4g4r4a4_unorm - b4g4r4x4_unorm
  l8_unorm - r8_unorm
  i8_unorm - l8_unorm
  i8_unorm - a8_unorm
  i8_unorm - r8_unorm
  l16_unorm - r16_unorm
  z24_unorm_s8_uscaled - z24x8_unorm
  s8_uscaled_z24_unorm - x8z24_unorm
  r8g8b8a8_unorm - r8g8b8x8_unorm
  a8b8g8r8_srgb - x8b8g8r8_srgb
  b8g8r8a8_srgb - b8g8r8x8_srgb
  a8r8g8b8_srgb - x8r8g8b8_srgb
  a8b8g8r8_unorm - x8b8g8r8_unorm
  r10g10b10a2_uscaled - r10g10b10x2_uscaled
  r10sg10sb10sa2u_norm - r10g10b10x2_snorm

Note that format compatibility is not commutative.

For software drivers this means that memcpy/util_copy_rect() will achieve the correct result. For hardware drivers this means that a VRAM-VRAM 2D blit engine will also achieve the correct result. So I'd expect no implementation change of resource_copy_region() for any driver, AFAICT. But I'd like to be sure.

Jose

José, this looks good to me. Note that the analogous function in d3d10, ResourceCopyRegion, only requires formats to be in the same typeless group (hence the same number of bits for all components), which is certainly a broader set of compatible formats than what util_is_format_compatible() produces. As far as I can tell, no conversion happens at all in d3d10; it is just like memcpy.
I think we might want to support that in the future as well, but for now extending this to the formats you listed certainly sounds ok. Roland -- This SF.net Dev2Dev email is sponsored by: Show off your parallel programming skills. Enter the Intel(R) Threading Challenge 2010. http://p.sf.net/sfu/intel-thread-sfd ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
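The compatibility rule described in the thread above - a bitwise copy must behave like a textured-quad blit, so the destination may only drop channels, never gain them - can be sketched as a toy check in C. This is an illustrative model only, with made-up names; it is not the actual util_is_format_compatible() implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <assert.h>

/* Toy model -- NOT the real util_is_format_compatible(). A format is
 * modeled as four channels, each with a bit width and a flag saying
 * whether it carries data (used == false models an "x" padding channel). */
struct fmt_channel { unsigned bits; bool used; };
struct fmt { struct fmt_channel chan[4]; };

/* A bitwise copy behaves like a textured-quad blit when the bit layout
 * matches exactly and the destination only ever *drops* information:
 * a used source channel may land in an unused destination channel, but
 * an unused source channel must never feed a used destination channel.
 * This asymmetry is also why the relation is not commutative. */
static bool fmt_compatible(const struct fmt *src, const struct fmt *dst)
{
   for (size_t i = 0; i < 4; i++) {
      if (src->chan[i].bits != dst->chan[i].bits)
         return false;   /* channel sizes/positions must line up */
      if (dst->chan[i].used && !src->chan[i].used)
         return false;   /* dst would interpret padding as data */
   }
   return true;
}
```

Under this model, for example, b8g8r8a8_unorm to b8g8r8x8_unorm passes (alpha is dropped) while the reverse fails, matching the non-commutativity noted in the mail.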
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats
On 06.09.2010 17:16, Luca Barbieri wrote:

On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca jfons...@vmware.com wrote: I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is:

I was about to propose something like this. How about a much more powerful change, though, that would make any pair of non-blocked formats of the same bit depth compatible? This way you could copy z24s8 to r8g8b8a8, for instance.

I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.

In addition to this, how about explicitly allowing sampler views to use a compatible format, and adding the ability for surfaces to use a compatible format too? (with a new parameter to get_tex_surface)

Note that get_tex_surface is dead (in gallium-array-textures - not merged yet, but it will happen eventually). Its replacement for render targets or depth/stencil, create_surface(), can already be supplied with a format parameter. Compatible formats, though, should ultimately end up as something similar to dx10.

This would allow, for instance, implementing glBlitFramebuffer on stencil buffers by reinterpreting the buffer as r8g8b8a8, and would allow the blitter module to copy depth/stencil buffers by simply treating them as color buffers. The only issue is that some drivers might hold depth/stencil surfaces in compressed formats that cannot be interpreted as a color format, and not have any mechanism for keeping temporaries or doing conversions internally.

I think that's a pretty big if. I could be wrong, but I think operations like blitting stencil buffers are pretty rare anyway (afaik other APIs don't allow things like that).

DirectX seems to have something like this with the _TYPELESS formats.
Yes, and it precisely won't allow you to interpret s24_z8 as r8g8b8a8 or other wonky stuff. Only if all components have the same number of bits.

Roland
Re: [Mesa3d-dev] RFC: allow resource_copy_region between different (yet compatible) formats
On 06.09.2010 22:03, Luca Barbieri wrote:

This way you could copy z24s8 to r8g8b8a8, for instance.

I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here.

Yes, it depends on the flexibility of the hardware and the driver. Due to depth textures, I think it is actually likely that you can easily treat depth as color. The worst issue right now is that stencil cannot be accessed in a sensible way at all, which makes implementing glBlitFramebuffer of STENCIL_BIT with NEAREST and different rect sizes impossible. Some cards (r600+ at least) can write stencil in shaders, but on some you must reinterpret the surface. And resource_copy_region does not support stretching, so it can't be used. Since not all cards can write stencil in shaders, one either needs to be able to bind depth/stencil as a color buffer, or extend resource_copy_region to support stretching with nearest filtering, or both (possibly in addition to having the option of using stencil export in shaders).

Yes, accessing stencil is a problem - other APIs just disallow it... There are other problems with accessing stencil, for instance WritePixels with a multisampled depth/stencil buffer (which you can't really map, hence CPU fallbacks don't even work). Plus you really don't want any CPU fallbacks anyway. Using stencil export (ARB_shader_stencil_export) seems like a clean solution, but as you said not all cards support it. Plus you can't actually get the stencil values with texture sampling either, so this doesn't help that much (well, you can't get them with GL, though hardware may support it, I guess). When I said it won't work with decent hardware, I really meant it won't work due to compression.
Now, it's quite possible this can be disabled on any chip, but you don't know that beforehand, hence you need to jump through hoops later to get an uncompressed version of your compressed buffer. Do applications actually ever use BlitFramebuffer with the stencil bit (with different sizes - otherwise resource_copy_region could be used)? It just seems to me that casts to completely different formats (well, still with the same total bit width, but still) are very unclean, but I don't have any good solution for this - if no one ever uses it in practice a CPU fallback is just fine, but as said that won't work for multisampled buffers either. Other things would likely benefit, such as GL_NV_copy_depth_to_color.

Roland
Re: [Mesa3d-dev] ARB draw buffers + texenv program
On 13.04.2010 02:52, Dave Airlie wrote:

On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote: Dave Airlie wrote: Just going down the r300g piglit failures I noticed fbo-drawbuffers failed. I've no idea if this passes on Intel hw, but it appears the texenvprogram really needs to understand the draw buffers. The attached patch fixes it here for me on r300g - anyone want to test this on Intel with the piglit test before/after?

The piglit test passes as-is with Mesa/swrast and NVIDIA. It fails with gallium/softpipe both with and without your patch. I think that your patch is on the right track. But multiple render targets are still a bit of an untested area in the st/mesa code. One thing: the patch introduces a dependency on buffer state in the texenvprogram code, so in state.c we should check for the _NEW_BUFFERS flag. Otherwise, I'd like to debug the softpipe failure a bit further to see what's going on. Perhaps you could hold off on committing this for a bit...

Well, Eric pointed out to me the fun line in the spec: "(3) Should gl_FragColor be aliased to gl_FragData[0]? RESOLUTION: No. A shader should write either gl_FragColor, or gl_FragData[n], but not both. Writing to gl_FragColor will write to all draw buffers specified with DrawBuffersARB." So I was really just masking the issue with this. From what I can see softpipe messes up, and I'm not sure where we should be fixing this. swrast does okay; it's just an open question whether we should be doing something in gallium or in the drivers.

Hmm, yes, looks like that's not really well defined. I guess there are several options here:

1) Don't do anything at the state tracker level, and assume that if a fragment shader only writes to color 0 but has several color buffers bound, the color is meant to go to all outputs. Looks like that's what nv50 is doing today.
If a shader writes to FragData[0] but not the others, in gallium that would mean the output still gets replicated to all outputs, but since the spec says unwritten outputs are undefined that would be just fine (for OpenGL - not sure about other APIs).

2) Use some explicit means to distinguish FragData[] from FragColor in gallium. For instance, we could use different semantic names (like TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective outputs). Or we could have a flag somewhere (not quite sure where) saying whether the color output is to be replicated to all buffers.

3) Translate away the single color output in the state tracker to multiple outputs.

I don't like option 3), though. It means we need to recompile if the attached buffers change. Moreover, it seems both new nvidia and AMD chips (r600 has a MULTIWRITE_ENABLE bit) handle this just fine in hw. I don't like option 1) either; that kind of implicit behavior might be ok, but this kind of guesswork isn't very nice imho.

Opinions?

Roland

-- Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
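For illustration, the replication semantics under discussion (option 1) could be modeled roughly as follows. This is a hypothetical sketch with made-up names, not actual state tracker or driver code:

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical model of the replication rule: a shader declaring a
 * single color output (the gl_FragColor case) has that value broadcast
 * to every bound color buffer; a shader declaring several outputs (the
 * gl_FragData[] case) writes each one only to its own buffer, and any
 * buffer without a matching output is left untouched (i.e. undefined
 * as far as the API is concerned). */
static void route_color_outputs(const float *outputs, size_t num_outputs,
                                float cbufs[], size_t num_cbufs)
{
   if (num_outputs == 1) {
      for (size_t i = 0; i < num_cbufs; i++)
         cbufs[i] = outputs[0];           /* replicate output 0 */
   } else {
      for (size_t i = 0; i < num_outputs && i < num_cbufs; i++)
         cbufs[i] = outputs[i];           /* strict 1:1 mapping */
   }
}
```

The branch on the number of declared outputs is exactly the implicit guesswork the mail complains about; an explicit "replicate" flag or distinct semantic names would move that decision out of the driver.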
Re: [Mesa3d-dev] ARB draw buffers + texenv program
On 13.04.2010 20:28, Alex Deucher wrote: On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson mostawesomed...@gmail.com wrote: On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger srol...@vmware.com wrote: On 13.04.2010 02:52, Dave Airlie wrote: On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote: Dave Airlie wrote: Just going down the r300g piglit failures and noticed fbo-drawbuffers failed, I've no idea if this passes on Intel hw, but it appears the texenvprogram really needs to understand the draw buffers. The attached patch fixes it here for me on r300g anyone want to test this on Intel with the piglit test before/after? The piglit test passes as-is with Mesa/swrast and NVIDIA. It fails with gallium/softpipe both with and w/out your patch. I think that your patch is on the right track. But multiple render targets are still a bit of an untested area in the st/mesa code. One thing: the patch introduces a dependency on buffer state in the texenvprogram code so in state.c we should check for the _NEW_BUFFERS flag. Otherwise, I'd like to debug the softpipe failure a bit further to see what's going on. Perhaps you could hold off on committing this for a bit... Well Eric pointed out to me the fun line in the spec (3) Should gl_FragColor be aliased to gl_FragData[0]? RESOLUTION: No. A shader should write either gl_FragColor, or gl_FragData[n], but not both. Writing to gl_FragColor will write to all draw buffers specified with DrawBuffersARB. So I was really just masking the issue with this. From what I can see softpipe messes up and I'm not sure where we should be fixing this. swrast does okay, its just whether we should be doing something in gallium or in the drivers is open. Hmm yes looks like that's not really well defined. I guess there are several options here: 1) don't do anything at the state tracker level, and assume that if a fragment shader only writes to color 0 but has several color buffers bound the color is meant to go to all outputs. 
Looks like that's what nv50 is doing today. If a shader writes to FragData[0] but not others, in gallium that would mean that output still gets replicated to all outputs, but since the spec says unwritten outputs are undefined that would be just fine (for OpenGL - not sure about other APIs). 2) Use some explicit means to distinguish FragData[] from FragColor in gallium. For instance, could use different semantic name (like TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective outputs). Or could have a flag somewhere (not quite sure where) saying if color output is to be replicated to all buffers. 3) Translate away the single color output in state tracker to multiple outputs. I don't like option 3) though. Means we need to recompile if the attached buffers change. Moreover, it seems both new nvidia and AMD chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw. I don't like option 1) neither, that kind of implicit behavior might be ok but this kind of guesswork isn't very nice imho.

Whatever's easiest, just document it. I'd be cool with:

  DECL IN[0], COLOR, PERSPECTIVE
  DECL OUT[0], COLOR
  MOV OUT[0], IN[0]
  END

effectively being a write to all color buffers. However, this one from progs/tests/drawbuffers:

  DCL IN[0], COLOR, LINEAR
  DCL OUT[0], COLOR
  DCL OUT[1], COLOR[1]
  IMM FLT32 { 1., 0., 0., 0. }
  0: MOV OUT[0], IN[0]
  1: SUB OUT[1], IMM[0]., IN[0]
  2: END

would then double-write the second color buffer. Unpleasant. Language like this would work, I suppose? "If only one color output is declared, writes to the color output shall be redirected to all bound color buffers. Otherwise, color outputs shall be bound to their specific color buffer."

Also, keep in mind that writing to multiple color buffers uses additional memory bandwidth, so for performance we should only do so when required.

Do apps really have several color buffers bound but only write to one, leaving the state of the others undefined in the process?
Sounds like a poor app to begin with, to me. Actually, I would restrict that language above further, so only color output 0 gets redirected to all buffers if it's the only one written. As said, though, I'd think some explicit bits somewhere are cleaner. I'm not yet sure that the above would really work for all APIs; it is possible some say other buffers not written to are left as-is instead of undefined.

Roland
Re: [Mesa3d-dev] ARB draw buffers + texenv program
On 14.04.2010 00:38, Dave Airlie wrote: On Wed, Apr 14, 2010 at 8:33 AM, Roland Scheidegger srol...@vmware.com wrote: On 13.04.2010 20:28, Alex Deucher wrote: On Tue, Apr 13, 2010 at 2:21 PM, Corbin Simpson mostawesomed...@gmail.com wrote: On Tue, Apr 13, 2010 at 6:42 AM, Roland Scheidegger srol...@vmware.com wrote: On 13.04.2010 02:52, Dave Airlie wrote: On Tue, Apr 6, 2010 at 2:00 AM, Brian Paul bri...@vmware.com wrote: Dave Airlie wrote: Just going down the r300g piglit failures and noticed fbo-drawbuffers failed, I've no idea if this passes on Intel hw, but it appears the texenvprogram really needs to understand the draw buffers. The attached patch fixes it here for me on r300g anyone want to test this on Intel with the piglit test before/after? The piglit test passes as-is with Mesa/swrast and NVIDIA. It fails with gallium/softpipe both with and w/out your patch. I think that your patch is on the right track. But multiple render targets are still a bit of an untested area in the st/mesa code. One thing: the patch introduces a dependency on buffer state in the texenvprogram code so in state.c we should check for the _NEW_BUFFERS flag. Otherwise, I'd like to debug the softpipe failure a bit further to see what's going on. Perhaps you could hold off on committing this for a bit... Well Eric pointed out to me the fun line in the spec (3) Should gl_FragColor be aliased to gl_FragData[0]? RESOLUTION: No. A shader should write either gl_FragColor, or gl_FragData[n], but not both. Writing to gl_FragColor will write to all draw buffers specified with DrawBuffersARB. So I was really just masking the issue with this. From what I can see softpipe messes up and I'm not sure where we should be fixing this. swrast does okay, its just whether we should be doing something in gallium or in the drivers is open. Hmm yes looks like that's not really well defined. 
I guess there are several options here: 1) don't do anything at the state tracker level, and assume that if a fragment shader only writes to color 0 but has several color buffers bound the color is meant to go to all outputs. Looks like that's what nv50 is doing today. If a shader writes to FragData[0] but not others, in gallium that would mean that output still gets replicated to all outputs, but since the spec says unwritten outputs are undefined that would be just fine (for OpenGL - not sure about other APIs). 2) Use some explicit means to distinguish FragData[] from FragColor in gallium. For instance, could use different semantic name (like TGSI_SEMANTIC_COLOR and TGSI_SEMANTIC_GENERIC for the respective outputs). Or could have a flag somewhere (not quite sure where) saying if color output is to be replicated to all buffers. 3) Translate away the single color output in state tracker to multiple outputs. I don't like option 3) though. Means we need to recompile if the attached buffers change. Moreover, it seems both new nvidia and AMD chips (r600 has MULTIWRITE_ENABLE bit) handle this just fine in hw. I don't like option 1) neither, that kind of implicit behavior might be ok but this kind of guesswork isn't very nice imho. Whatever's easiest, just document it. I'd be cool with: DECL IN[0], COLOR, PERSPECTIVE DECL OUT[0], COLOR MOV OUT[0], IN[0] END Effectively being a write to all color buffers, however, this one from progs/tests/drawbuffers: DCL IN[0], COLOR, LINEAR DCL OUT[0], COLOR DCL OUT[1], COLOR[1] IMM FLT32 { 1., 0., 0., 0. } 0: MOV OUT[0], IN[0] 1: SUB OUT[1], IMM[0]., IN[0] 2: END Would then double-write the second color buffer. Unpleasant. Language like this would work, I suppose? If only one color output is declared, writes to the color output shall be redirected to all bound color buffers. Otherwise, color outputs shall be bound to their specific color buffer. 
Also, keep in mind that writing to multiple color buffers uses additional memory bandwidth, so for performance, we should only do so when required. Do apps really have several color buffers bound but only write to one, leaving the state of the others undefined in the process? Sounds like a poor app to begin with to me. Actually, I would restrict that language above further, so only color output 0 will get redirected to all buffers if it's the only one written. As said though I'd think some explicit bits somewhere are cleaner. I'm not yet sure that the above would really work for all APIs, it is possible some say other buffers not written to are left as is instead of undefined.

Who knows - the GL API allows for it, and I don't see how we can arbitrarily decide to restrict it. I could write an app that uses multiple fragment programs and switches between them, with two output buffers bound, though I'm possibly constructing something very arbitrary.

I fail to see the problem. If you have two color buffers bound
Re: [Mesa3d-dev] gallium-resources branch merge
On 10.04.2010 14:00, Keith Whitwell wrote: Hmm, not sure whether to merge or squash-merge this branch. Any thoughts?

I'm no big fan of squash merges, but the history of the normal merge won't be nice either. Tough call, though I'd prefer a normal merge.

Roland
Re: [Mesa3d-dev] gallium-resources branch merge
On 10.04.2010 16:43, Chia-I Wu wrote: On Sat, Apr 10, 2010 at 8:00 PM, Keith Whitwell keith.whitw...@googlemail.com wrote: Hmm, not sure whether to merge or squash-merge this branch. Any thoughts?

The conversion to pipe_resource seems to be done by components. Maybe a new branch that reorganizes (git rebase -i) the commits in gallium-resources, and merge the new branch to master?

I've never used git rebase -i, but I'm not convinced it can give something sensible here. It wasn't done strictly by components, with a couple of merges from master (and gallium-buffer-usage-cleanup) in between, and fixes for already-converted things...

Roland
Re: [Mesa3d-dev] gallium-resources branch merge
On 10.04.2010 17:10, Keith Whitwell wrote: On Sat, Apr 10, 2010 at 4:05 PM, Keith Whitwell keith.whitw...@googlemail.com wrote: On Sat, Apr 10, 2010 at 3:49 PM, Roland Scheidegger srol...@vmware.com wrote: On 10.04.2010 16:43, Chia-I Wu wrote: On Sat, Apr 10, 2010 at 8:00 PM, Keith Whitwell keith.whitw...@googlemail.com wrote: Hmm, not sure whether to merge or squash-merge this branch. Any thoughts? The conversion to pipe_resource seems to be done by components. Maybe a new branch that reorganizes (git rebase -i) the commits in gallium-resources and merge the new branch to master? I've never used git rebase -i but I'm not convinced that can give something sensible. It wasn't done strictly by components, with a couple merges from master (and gallium-buffer-usage-cleanup) in between and fixes for already converted things...

Squash merge it is. Somewhat arbitrary decision, to avoid stretching this out any further. I don't think the history that was on the branch was very useful, nor does inventing history seem likely to help people searching for regressions, etc. The branch is effectively an atomic change, so let's deal with it like that...

Yeah, you're right. Thinking about it, parts of it were always broken throughout the life of the branch or didn't even build, so a squash merge makes sense. Glad it's merged - no more conflict fixing for merges from master :-).

Roland
Re: [Mesa3d-dev] Mesa (gallium-resources): gallium: fix comments for changed USAGE flags
On 09.04.2010 17:49, Keith Whitwell wrote: On Fri, 2010-04-09 at 08:45 -0700, Roland Scheidegger wrote:

Module: Mesa
Branch: gallium-resources
Commit: faf53328d1154c51d8a59513f2bfcae62272b0bf
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=faf53328d1154c51d8a59513f2bfcae62272b0bf
Author: Roland Scheidegger srol...@vmware.com
Date: Fri Apr 9 17:44:24 2010 +0200

gallium: fix comments for changed USAGE flags

---
 src/gallium/auxiliary/util/u_simple_screen.h  |  9 +
 src/gallium/drivers/svga/svga_winsys.h        | 10 --
 src/gallium/include/pipe/p_screen.h           |  2 +-
 src/gallium/include/state_tracker/sw_winsys.h |  2 +-
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_simple_screen.h b/src/gallium/auxiliary/util/u_simple_screen.h
index 0042277..1ba59af 100644
--- a/src/gallium/auxiliary/util/u_simple_screen.h
+++ b/src/gallium/auxiliary/util/u_simple_screen.h
@@ -73,9 +73,10 @@ struct pipe_winsys
  * window systems must then implement that interface (rather than the
  * other way around...).
  *
- * usage is a bitmask of PIPE_BUFFER_USAGE_PIXEL/VERTEX/INDEX/CONSTANT. This
- * usage argument is only an optimization hint, not a guarantee, therefore
- * proper behavior must be observed in all circumstances.
+ * usage is a bitmask of PIPE_BIND_*.
+ * XXX is this true?
+ * This usage argument is only an optimization hint, not a guarantee,
+ * therefore proper behavior must be observed in all circumstances.

The new flags are no longer hints - they are supposed to actually specify which operations are permitted on a resource. Unfortunately I don't think this is very well enforced yet -- I intend to add a debug layer to sit between state tracker and driver, based on the drivers/identity layer, which will check for violations of this and other rules.

Ok, I thought this to be the case, but wasn't sure. I'll fix the comment. In the svga code, I actually couldn't figure out the usage flags when a winsys buffer is created.
It looks like usage is always 0, except for queries, which use SVGA_BUFFER_USAGE_PINNED. Of course, that's not a resource but a winsys buffer, but as far as I can tell this ends up in a pb_buffer usage flag. Not sure if that's ok or supposed to be like that...

Roland
Re: [Mesa3d-dev] gallium-resources branch merge
On 09.04.2010 17:29, STEVE555 wrote: Hi all, I've git branched and got the latest commits from the gallium-resources branch and also the latest commits from git master. I did a gmake -B realclean from a previous compile on my copy of git master, and did a git checkout gallium-resources to switch to that branch, and did a ./autogen.sh with the following options:

  --prefix=/usr/local --enable-32-bit --enable-xcb --enable-gallium-nouveau --with-state-trackers=dri,egl,xorg,glx,vega,es --enable-motif --enable-gl-osmesa --disable-gallium-intel --disable-gallium-radeon --with-expat=/usr/lib --with-demos=xdemos,demos,trivial,tests --with-dri-drivers=swrast --enable-gallium-swrast --enable-gallium-svga --with-max-width=4096 --with-max-height=4096 --enable-debug

I then did a gmake to compile my copy of gallium-resources, but it ended with an error at the end:

This should be fixed now.

Roland
Re: [Mesa3d-dev] Mesa (gallium-resources): gallium: fix comments for changed USAGE flags
On 09.04.2010 18:22, José Fonseca wrote: On Fri, 2010-04-09 at 09:02 -0700, Keith Whitwell wrote: On Fri, 2010-04-09 at 08:59 -0700, Roland Scheidegger wrote: On 09.04.2010 17:49, Keith Whitwell wrote: On Fri, 2010-04-09 at 08:45 -0700, Roland Scheidegger wrote: Module: Mesa Branch: gallium-resources Commit: faf53328d1154c51d8a59513f2bfcae62272b0bf URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=faf53328d1154c51d8a59513f2bfcae62272b0bf Author: Roland Scheidegger srol...@vmware.com Date: Fri Apr 9 17:44:24 2010 +0200 gallium: fix comments for changed USAGE flags --- src/gallium/auxiliary/util/u_simple_screen.h |9 + src/gallium/drivers/svga/svga_winsys.h| 10 -- src/gallium/include/pipe/p_screen.h |2 +- src/gallium/include/state_tracker/sw_winsys.h |2 +- 4 files changed, 11 insertions(+), 12 deletions(-) diff --git a/src/gallium/auxiliary/util/u_simple_screen.h b/src/gallium/auxiliary/util/u_simple_screen.h index 0042277..1ba59af 100644 --- a/src/gallium/auxiliary/util/u_simple_screen.h +++ b/src/gallium/auxiliary/util/u_simple_screen.h @@ -73,9 +73,10 @@ struct pipe_winsys * window systems must then implement that interface (rather than the * other way around...). * -* usage is a bitmask of PIPE_BUFFER_USAGE_PIXEL/VERTEX/INDEX/CONSTANT. This -* usage argument is only an optimization hint, not a guarantee, therefore -* proper behavior must be observed in all circumstances. +* usage is a bitmask of PIPE_BIND_*. +* XXX is this true? +* This usage argument is only an optimization hint, not a guarantee, +* therefore proper behavior must be observed in all circumstances. The new flags are no longer hints - they are supposed actually specify which operations are permitted on a resource. Unfortunately I don't think this is very well enforced yet -- I intend to add a debug layer to sit between state-tracker and driver, based on the drivers/identity layer, which will check for violations of this other rules. Ok, I thought this to be the case, but wasn't sure. 
I'll fix the comment. In the svga code, I actually couldn't figure out the usage flags when a winsys buffer is created. It looks like usage is always 0, except for queries, which use SVGA_BUFFER_USAGE_PINNED. Of course, that's not a resource but a winsys buffer, but as far as I can tell this ends up in a pb_buffer usage flag. Not sure if that's ok or supposed to be like that...

Jose has looked at this more recently than I have...

pb_buffer sits between the pipe driver and the winsys, and needs to pass custom buffer flags unmodified from svga to the winsys. SVGA_BUFFER_USAGE_PINNED is one of those usages.

So the svga winsys buffer_create function takes only custom flags, none of the PB_USAGE ones? This is the idea I got from the code (plus the custom flags would clearly overlap with the generic ones), and hence what I updated the comment to (which clearly was wrong). I'm not sure, though, that this really works with the pb code; I thought it might do some checks on the usage flags there, but if you say it works then I'd better believe it...

Roland
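The debug layer mentioned earlier in this thread boils down to treating bind flags as a contract rather than a hint. A minimal sketch of such a check - the flag names and struct here are made up for illustration; the real values are the PIPE_BIND_* defines in gallium's headers:

```c
#include <stdbool.h>
#include <assert.h>

/* Made-up stand-ins for the real PIPE_BIND_* defines. */
enum {
   BIND_RENDER_TARGET = 1 << 0,
   BIND_SAMPLER_VIEW  = 1 << 1,
   BIND_DEPTH_STENCIL = 1 << 2,
};

struct resource { unsigned bind; };

/* Since the new flags are a guarantee rather than a hint, a debug layer
 * sitting between state tracker and driver can reject any operation
 * whose required bind bits were not declared at resource creation. */
static bool bind_flags_ok(const struct resource *res, unsigned required)
{
   return (res->bind & required) == required;
}
```

A layer like the drivers/identity one could call such a check before forwarding, e.g., a set-sampler-view or set-render-target operation to the real driver.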
[Mesa3d-dev] gallium-resources branch merge
I'm planning on merging the gallium-resources branch shortly (after Easter). Due to the amount of code changed, it wouldn't be unexpected if some drivers break here and there, so it would be nice if the respective driver authors could take a look at that branch now. If you've missed the discussion about this branch and what it is about, here it is: http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12726.html

I've also removed the video interfaces completely, as they weren't ported to the interface changes, and actually some of the video code missed some earlier interface changes so it didn't build anyway. Video-related work should be done on the pipe-video branch, which already had newer stuff (for video).

Roland
Re: [Mesa3d-dev] How do we init half float tables?
On 02.04.2010 17:09, Luca Barbieri wrote: Additionally, the S3TC library may now support only a subset of the formats. This may be even more useful as further compressed formats are added.

FWIW, I don't see any new s3tc formats. rgtc will not be handled by the s3tc library since it isn't patent encumbered, and util_format_is_s3tc will not include rgtc formats. (Though I guess that external decoding per-pixel is really rather lame; we should do it per-block...)

Roland
Re: [Mesa3d-dev] [PATCH] glsl: optimize sqrt
On 29.03.2010 04:50, Marek Olšák wrote: We were talking a bit on IRC that the GLSL compiler implements the sqrt function somewhat inefficiently. Instead of rsq+rcp+cmp instructions as in the original code, the proposed patch uses just rsq+mul. Please see the patch log for further explanation, and please review.

I'll definitely agree with the mul instead of rcp part, as that should be more efficient on a lot of modern hardware (rcp usually being part of some special function block instead of the main alu). As far as I can tell though, we still need the cmp unfortunately, since invsqrt(0) is infinite and multiplying that by 0 will give some undefined result; for IEEE it should be NaN (well, depending on hardware I guess; if you have an implementation which clamps infinity to its max representable number it should be ok). In any case, glsl says invsqrt(0) is undefined, hence we can't rely on this.

Thinking about it, we'd possibly want a SQRT opcode, both in mesa and tgsi. Because there's actually hardware which can do sqrt (i965 MathBox), and just as importantly because this gives drivers a way to implement this as invsqrt + mul without the cmp, if they can. For instance AMD hardware generally has 3 rounding modes for these ops: IEEE (which gives infinity for invsqrt(0)), DX (clamps to MAX_FLOAT), and FF (which clamps infinity to 0, exactly what you need to implement sqrt with a mul and invsqrt and no cmp; though actually it should work with DX clamping as well).

Roland

-Marek

From 9b834a79a1819f3b4b9868be3e2696667791c83e Mon Sep 17 00:00:00 2001
From: Marek Olšák mar...@gmail.com
Date: Sat, 27 Mar 2010 13:49:09 +0100
Subject: [PATCH] glsl: optimize sqrt

The new version can be derived from sqrt as follows:

sqrt(x) = sqrt(x)^2 / sqrt(x) = x / sqrt(x) = x * rsqrt(x)

Also the need for the CMP instruction is gone because there is no division by zero.
---
 .../shader/slang/library/slang_common_builtin.gc | 22 ++++------------------
 1 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/src/mesa/shader/slang/library/slang_common_builtin.gc b/src/mesa/shader/slang/library/slang_common_builtin.gc
index a25ca55..3f6596c 100644
--- a/src/mesa/shader/slang/library/slang_common_builtin.gc
+++ b/src/mesa/shader/slang/library/slang_common_builtin.gc
@@ -602,50 +602,36 @@ vec4 exp2(const vec4 a)
 float sqrt(const float x)
 {
-   const float nx = -x;
    float r;
    __asm float_rsq r, x;
-   __asm float_rcp r, r;
-   __asm vec4_cmp __retVal, nx, r, 0.0;
+   __retVal = r * x;
 }
 
 vec2 sqrt(const vec2 x)
 {
-   const vec2 nx = -x, zero = vec2(0.0);
    vec2 r;
    __asm float_rsq r.x, x.x;
    __asm float_rsq r.y, x.y;
-   __asm float_rcp r.x, r.x;
-   __asm float_rcp r.y, r.y;
-   __asm vec4_cmp __retVal, nx, r, zero;
+   __retVal = r * x;
 }
 
 vec3 sqrt(const vec3 x)
 {
-   const vec3 nx = -x, zero = vec3(0.0);
    vec3 r;
    __asm float_rsq r.x, x.x;
    __asm float_rsq r.y, x.y;
    __asm float_rsq r.z, x.z;
-   __asm float_rcp r.x, r.x;
-   __asm float_rcp r.y, r.y;
-   __asm float_rcp r.z, r.z;
-   __asm vec4_cmp __retVal, nx, r, zero;
+   __retVal = r * x;
 }
 
 vec4 sqrt(const vec4 x)
 {
-   const vec4 nx = -x, zero = vec4(0.0);
    vec4 r;
    __asm float_rsq r.x, x.x;
    __asm float_rsq r.y, x.y;
    __asm float_rsq r.z, x.z;
    __asm float_rsq r.w, x.w;
-   __asm float_rcp r.x, r.x;
-   __asm float_rcp r.y, r.y;
-   __asm float_rcp r.z, r.z;
-   __asm float_rcp r.w, r.w;
-   __asm vec4_cmp __retVal, nx, r, zero;
+   __retVal = r * x;
 }
Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: List Quake3 extensions first.
On 16.03.2010 18:52, Keith Whitwell wrote: On Tue, 2010-03-16 at 08:32 -0700, Ian Romanick wrote: I'm also a bit surprised that not detecting GL_EXT_compiled_vertex_array has any impact on our Quake3 performance. After all, our CVA implementation doesn't do anything! Looking at the list, it seems more likely that GL_EXT_texture_env_add is the problem. Not having that will cause Quake3 to use additional rendering passes in quite a few cases.

I think if CVA isn't present, it falls back to glVertex() and friends... Bad...

I'm not sure though that listing that extension first really solves all problems. There's a quite famous bug: when you bring up the information screen with the extension string, it'll actually segfault. I think that got fixed in later versions (though I don't know how; if it just copies the first n bytes of the extension string, it obviously wouldn't solve the problem that it doesn't recognize the CVA extension...). And against this you can't really do anything other than app detection and cutting the string appropriately...

Roland
[Mesa3d-dev] extensions supported or not in gallium
Hi, there are currently a couple of extensions enabled in the mesa state tracker which probably shouldn't be. These were moved there by commit a0ae2ca033ec2024da1e01d1c11c0437837c031b (that is, with dri they were already always enabled before). Does someone know off-hand which ones we can enable or not? I'm going to kill off EXT_cull_vertex and TDFX_texture_compression_FXT1; clearly we can't handle them. The others in question are:

ARB_window_pos
APPLE_client_storage
MESA_pack_invert
NV_vertex_program
NV_vertex_program1_1

(for the latter two, IIRC the problem was that regs needed to be zero-initialized)

Currently gallium dri drivers also have ARB_imaging enabled (via driInitExtensions()); I think that's not correct either.

Roland
Re: [Mesa3d-dev] extensions supported or not in gallium
On 11.03.2010 17:54, Brian Paul wrote: Roland Scheidegger wrote: Hi, there are currently a couple of extensions enabled in the mesa state tracker which probably shouldn't be. These were moved there by commit a0ae2ca033ec2024da1e01d1c11c0437837c031b (that is, with dri they were already always enabled before). Does someone know off-hand which ones we can enable or not? I'm going to kill off EXT_cull_vertex and TDFX_texture_compression_FXT1; clearly we can't handle them. The others in question are:

ARB_window_pos
handled in core mesa.

APPLE_client_storage
should not be enabled by default.

MESA_pack_invert
handled by core mesa.

NV_vertex_program
NV_vertex_program1_1
(for the latter two, IIRC the problem was that regs needed to be zero-initialized)
There may be other issues too. Someone would have to enable the extension(s) and do some testing.

Currently gallium dri drivers also have ARB_imaging enabled (via driInitExtensions()); I think that's not correct either.
Yeah, I think that needs to be disabled.

Ok thanks, I've pushed a fix.

Roland
Re: [Mesa3d-dev] Mesa (master): util: Code generate functions to pack and unpack a single pixel.
On 07.03.2010 01:21, José Fonseca wrote: On Sat, 2010-03-06 at 05:44 -0800, Brian Paul wrote: On Sat, Mar 6, 2010 at 5:44 AM, José Fonseca jfons...@vmware.com wrote: On Mon, 2010-03-01 at 09:03 -0800, Michel Dänzer wrote: On Fri, 2010-02-26 at 08:47 -0800, Jose Fonseca wrote: Module: Mesa Branch: master Commit: 9beb302212a2afac408016cbd7b93c8b859e4910 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=9beb302212a2afac408016cbd7b93c8b859e4910 Author: José Fonseca jfons...@vmware.com Date: Fri Feb 26 16:45:22 2010 + util: Code generate functions to pack and unpack a single pixel. Should work correctly for all pixel formats except SRGB formats. Generated code made much simpler by defining the pixel format as a C structure. For example this is the generated structure for PIPE_FORMAT_B6UG5SR5S_NORM:

union util_format_b6ug5sr5s_norm {
   uint16_t value;
   struct {
      int r:5;
      int g:5;
      unsigned b:6;
   } chan;
};

José, are you aware that the memory layout of bitfields is mostly implementation dependent? IME this makes them mostly unusable for modelling hardware in a portable manner.

It's not only implementation dependent and slow -- it is also buggy! gcc-4.4.3 is doing something very fishy to single-bit fields. See the attached code. ff ff ff ff is expected, but ff ff ff 01 is printed with gcc-4.4.3, even without any optimization. gcc-4.3.4 works fine. Am I missing something or is this effectively a bug?

Same result with gcc 4.4.1. If pixel.chan.a is put into a temporary int var followed by the scaling arithmetic, it comes out as expected. Looks like a bug to me.

Thanks. I'll submit a bug report then. BTW, it looks like sizeof(union util_format_b5g5r5a1_unorm) == 4, not 2. Yet another reason to stay away from bit fields...

Hmm, might that be because the bitfields are of type unsigned/int, not uint16_t? I've no idea either why it would return 01 and not ff.

Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review. In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
On 08.03.2010 14:22, Joakim Sindholt wrote: On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote: On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review. In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland I believe airlied pointed out earlier that http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72 fixed that problem. But even with that commit, all drivers still call driInitExtensions at least once, though the parameter list can be NULL. I don't see that happening here. Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
Otherwise, looks good to me, but I'd prefer if someone more familiar with the extension handling code could give it a look. Roland On 08.03.2010 17:03, Marek Olšák wrote: Alright, I will add driInitExtensions(ctx, NULL, TRUE) at the end of st_init_extensions. Anything else I missed or is it OK? -Marek On Mon, Mar 8, 2010 at 4:25 PM, Roland Scheidegger srol...@vmware.com wrote: On 08.03.2010 14:22, Joakim Sindholt wrote: On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote: On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review. In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland I believe airlied pointed out earlier that http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72 fixed that problem.
But even with that commit, all drivers still call driInitExtensions at least once, though the parameter list can be NULL. I don't see that happening here. Roland
Re: [Mesa3d-dev] dri-extension branch - clean up advertising extensions in Gallium
Well I guess another solution would be to just call it directly from the place the dri_extension code initially was, i.e. in dri_create_context. Roland On 08.03.2010 17:21, Jakob Bornecrantz wrote: Calling dri code from src/mesa/state_tracker is not allowed since it's supposed to be independent of windowing systems. That said, from what I can see, both driInitExtensions and driInitSingleExtension could be folded into mesa core; I can't see anything dri-special about them. Cheers Jakob. On 8 mar 2010, at 16.12, Roland Scheidegger wrote: Otherwise, looks good to me, but I'd prefer if someone more familiar with the extension handling code could give it a look. Roland On 08.03.2010 17:03, Marek Olšák wrote: Alright, I will add driInitExtensions(ctx, NULL, TRUE) at the end of st_init_extensions. Anything else I missed or is it OK? -Marek On Mon, Mar 8, 2010 at 4:25 PM, Roland Scheidegger srol...@vmware.com wrote: On 08.03.2010 14:22, Joakim Sindholt wrote: On Mon, 2010-03-08 at 13:16 +0100, Roland Scheidegger wrote: On 07.03.2010 20:26, Marek Olšák wrote: This branch is aimed to address the following issues: * Extensions are advertised in both st/mesa and st/dri, doing the same thing in two places. * The inability to disable extensions in pipe_screen::get_param because st/dri overrides the decisions of st/mesa. Here's the branch: http://cgit.freedesktop.org/~mareko/mesa/log/?h=dri-extensions The first commit moves the differences between st/dri and st/mesa to the latter and removes dri_init_extensions from st/dri. It doesn't remove any extensions from the list except for those not advertised by pipe_screen. The second commit enables texture_rectangle by default in Gallium. To my knowledge any Gallium hardware can do this and I suspect it was dependent on NPOT textures by accident. All this is of course tested with piglit and glean. Please review.
In case it's not OK, please let me know what needs to be done. The second commit looks fine to me. The first one, I'm not sure. Maybe that's ok, but if so I'm wondering why, since this skips all the mapping business driInitExtensions did and just sets the extension enable bits to true. At least I'm fairly sure it was needed in the past... Roland I believe airlied pointed out earlier that http://cgit.freedesktop.org/mesa/mesa/commit/?id=17ef1f6074d6107c167f1956a5c60993904c0b72 fixed that problem. But even with that commit, all drivers still call driInitExtensions at least once, though the parameter list can be NULL. I don't see that happening here. Roland
Re: [Mesa3d-dev] RFC: gallium-format-cleanup branch (was Gallium format swizzles)
On 03.03.2010 14:07, José Fonseca wrote: On Wed, 2010-03-03 at 04:27 -0800, Luca Barbieri wrote: PIPE_FORMAT_X8B8G8R8_UNORM is being used by mesa. PIPE_FORMAT_R8G8B8X8_UNORM doesn't exist, hence it appears to be unnecessary. So it doesn't make sense to rename. How about D3DFMT_X8B8G8R8? That should map to PIPE_FORMAT_R8G8B8X8_UNORM. Yes, you're right. BTW, we are also missing D3DFMT_X4R4G4B4, D3DFMT_X1R5G5B5, D3DFMT_A4L4, D3DFMT_A1, D3DFMT_L6V5U5, D3DFMT_D15S1, D3DFMT_D24X4S4, D3DFMT_CxV8U8 and perhaps others I did not notice. D3DFMT_L6V5U5 is there (PIPE_FORMAT_R5SG5SB6U_NORM). The others are indeed missing. None of the mentioned formats are required for D3D9 conformance, but we could add them to gallium. D3DFMT_A1 is special: it has less than 1 byte per pixel. Probably the best way to support it would be to treat it as an 8x1 macro pixel, 8 bits, similarly to compressed formats. D3DFMT_CxV8U8 has special semantics too.

And not only are those formats optional, some would be completely pointless in gallium (D15S1, D24X4S4). There's simply no modern hardware which supports 1-bit stencil (I think pretty much the only chip supporting that was savage3d), nor 4-bit stencil (can't remember off-hand any chip supporting that; maybe some of the then-professional chips did). The others sound a bit more plausible and hardware may support them, but I'm not sure they are really missed (A4L4, X4R4G4B4, X1R5G5B5). As José said, CxV8U8 isn't really just a format, and we'll need to add a 1-bit format for DX10.

Roland
Re: [Mesa3d-dev] Does DX9 SM3 - VMware svga with arbitrary semantics work? How?
On 03.03.2010 20:23, Luca Barbieri wrote: And never will... It does not export PIPE_CAP_GLSL, and does not have the shader opcodes to ever do so. Any Gallium driver should be able to support the GLSL subset without control flow. And if we had a proper optimization infrastructure capable of inlining functions, converting conditionals to multiplications and unrolling loops (e.g. look at what the nVidia Cg compiler does), then essentially all GLSL could be supported on any driver, with only limitations on the maximum number of loop iterations. Isn't it worth supporting that? BTW, proprietary drivers do this: for instance nVidia supports GLSL on nv30, which can't do control flow in fragment shaders and doesn't support SM3.

I think the i915 is a lot closer to r300 in that regard (which is quite a bit more limited than nv30), and it's true that ATI also supported glsl on that. As far as I know though, it was quite easy to bump into shaders which wouldn't compile. There's only so much you can do with 4 blocks of (max) 16 instructions to run without any control flow when you need to unroll loops, not to mention lacking instructions for derivatives, or the fact that things like sin/cos will take quite a few instructions... nv30, while processing fragment shaders slowly, had a LOT higher instruction count, IIRC supported derivatives and predication, and had no dependent texturing limit. So that makes it a lot better suited for glsl hacks. So, I'm not sure it really makes a whole lot of sense to support glsl on i915. It'll really only ever work for very simple things (granted, there are apps out there which indeed will only use glsl shaders which are known to compile fine on r300...)

Roland
Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
On 02.03.2010 11:37, Keith Whitwell wrote: On Mon, 2010-03-01 at 10:02 -0800, Roland Scheidegger wrote: Hi, this branch turns vertex element state into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though it still always emits the state). Drivers doing both hw vertex shaders and using draw in some circumstances will of course have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (the caller needs to set vertex element state, which is a bit odd).

Roland, The branch looks good to me, happy to see it merged when you're ready to go.

There's actually something in the cso code I was a bit unsure about; I've looked at it again and indeed it seems wrong. The problem is that the count value itself isn't stored for the comparison. So in the unlikely case that the hash value is the same for pipe_vertex_elements with different counts, the comparison can also succeed as long as the leading elements are identical, which seems very wrong. The easiest way to fix this would probably be to just store the count alongside the pipe_vertex_element data, but that would need an additional copy of the incoming data in cso_set_vertex_elements. Hmm...

Roland
[Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
Hi, this branch turns vertex element into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though still emits the state always). Drivers doing both hw vertex shaders and using draw in some circumstances of course will have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (caller needs to set vertex element state, which is a bit odd). Roland
Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
On 01.03.2010 19:02, Roland Scheidegger wrote: Hi, this branch turns vertex element into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though still emits the state always). Drivers doing both hw vertex shaders and using draw in some circumstances of course will have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (caller needs to set vertex element state, which is a bit odd). Ok, I've converted nv30/nv40 too. Not that they'd precalculate any hw state... Roland
Re: [Mesa3d-dev] [RFC] gallium-vertexelementcso branch merge
On 02.03.2010 00:18, Joakim Sindholt wrote: On Mon, 2010-03-01 at 19:02 +0100, Roland Scheidegger wrote: Hi, this branch turns vertex element into a cso, so instead of set_vertex_elements there's now the triad of create/bind/delete_vertex_elements_state. I have converted all the drivers except nouveau (I didn't do it because Christoph Bumiller already did nv50, but I can give the rest of them a shot), though that doesn't necessarily mean they are optimized for it (the idea is of course to precalculate state on create, not just copy the pipe structs and do everything on bind) - only i965g really does something close to it (though still emits the state always). Drivers doing both hw vertex shaders and using draw in some circumstances of course will have to store both representations on create. Also note that util_draw_vertex_buffer semantics have changed a bit (caller needs to set vertex element state, which is a bit odd). Roland

Can I still do things like: element 0: - vbo 5, element 1: - vbo 2, and then set_vertex_buffers() with an array { zeros, zeros, vbo 2, zeros, zeros, vbo 5 }?

The branch doesn't change pipe_vertex_element itself (except that nr_components got removed, as that's really derived from the associated pipe_format), only how those vertex elements are set. Hence you can do exactly the same things you could do before. Though I'm not quite sure what your zeros mean; if that's just an unused vbo it should be ok, but it is probably not ok to just pass in a null pointer for an unused pipe_vertex_buffer.

Roland
Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view
On 25.02.2010 18:39, michal wrote: Roland Scheidegger wrote on 2010-02-24 15:18: On 24.02.2010 12:48, Christoph Bumiller wrote: This wasn't a problem before because textures and samplers were linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch, this coordinate normalization bit becomes a problem. NV50 hardware has that bit in the RESOURCE binding, and not the SAMPLER binding, and you can imagine that this will lead to us having to jump through a few annoying looking hoops to accommodate. As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have sampler states that are decoupled from the texture, and which contain a normalized coordinates bit, so it's worth considering not having it there in gallium either. For OpenGL, unnormalized coordinates are only used for RECT textures, and in this case it makes sense to make it a property of the texture.

I agree this is not sampler state, but I don't quite agree this should be texture state. This changes how texture coordinates get interpreted in the interpolator - in that sense it is similar to the cylindrical texture coord wrap which we moved away from sampler state recently. That one got moved to the shader declaration. I wonder if the normalization bit should be treated the same. Though OTOH you're quite right that in OpenGL this really is a texture property (it is a different texture target after all), and afaik d3d doesn't support non-normalized coords (?). Hmm...

Isn't it the case that for RECT targets we clear the bit, and for others we always set it? In mesa st I see:

   if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
      sampler->normalized_coords = 1;

By definition, a RECT texture with normalised coordinates is just an NPOT texture. If we removed this apparently redundant flag, would that make the nouveau developers' life easier?

But we don't have rect targets in gallium, hence we need the flag. I think conceptually this makes sense since for texture layouts etc. drivers won't care one bit if this is a 2d npot or a rect texture. Though I guess introducing rect targets instead would be another option.

Roland

And, finally, I've seen you reverted the changes for independent image and sampler index in the texture opcodes. What's up with that? Is the code not nice enough, or has the idea been discarded and my problem disappears?

Please consider this branch dead. It will be easier for me to introduce new, optional sampler and fetch opcodes à la GL 3.0. There's just too much code to fix and test, and we still want the older hardware not to have to stand on its head trying to translate back to the old model. Thanks.
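What the flag actually changes can be made concrete with a toy texel-address computation; this is only an illustration of normalized vs RECT-style unnormalized addressing under nearest filtering with clamp-to-edge, not softpipe code:

```c
/* Normalized coords are scaled by the texture size before the lookup;
 * RECT-style unnormalized coords address texels directly. */
static int texel_x(float s, int width, int normalized)
{
   float u = normalized ? s * (float)width : s;
   int x = (int)u;                 /* nearest filtering */
   if (x < 0)
      x = 0;                       /* clamp-to-edge */
   if (x >= width)
      x = width - 1;
   return x;
}
```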
Re: [Mesa3d-dev] move normalized texel coordinates bit to sampler view
On 24.02.2010 12:48, Christoph Bumiller wrote: This wasn't a problem before because textures and samplers were linked 1:1, but in view of the gallium-gpu4-texture-opcodes branch, this coordinate normalization bit becomes a problem. NV50 hardware has that bit in the RESOURCE binding, and not the SAMPLER binding, and you can imagine that this will lead to us having to jump through a few annoying looking hoops to accommodate. As far as I can see, neither D3D10 nor D3D11 nor OpenGL nor CUDA have sampler states that are decoupled from the texture, and which contain a normalized coordinates bit, so it's worth considering not having it there in gallium either. For OpenGL, unnormalized coordinates are only used for RECT textures, and in this case it makes sense to make it a property of the texture.

I agree this is not sampler state, but I don't quite agree this should be texture state. This changes how texture coordinates get interpreted in the interpolator - in that sense it is similar to the cylindrical texture coord wrap which we moved away from sampler state recently. That one got moved to the shader declaration. I wonder if the normalization bit should be treated the same. Though OTOH you're quite right that in OpenGL this really is a texture property (it is a different texture target after all), and afaik d3d doesn't support non-normalized coords (?). Hmm...

Roland

And, finally, I've seen you reverted the changes for independent image and sampler index in the texture opcodes. What's up with that? Is the code not nice enough, or has the idea been discarded and my problem disappears?

Best regards, Christoph
Re: [Mesa3d-dev] [PATCH] st/dri: don't enable EXT_draw_buffers2 by default
Marek, I don't particularly like that patch, because it doesn't really fix the problem with the extension handling. There are lots of extensions listed there which should not be advertised by default, so picking one out won't fix the others. I think they are there because driInitExtensions definitely does more than just set ctx->Extensions.foo_bar to enable the extension. Other extensions in this list which are queried by CAP bits but still show up in the extension string regardless are the glsl ones (ARB_fragment_shader and friends), a couple of texture address modes (mirrored_repeat, mirror_clamp), blend_equation_separate, technically even ARB_multitexture (though we probably should skip the test for more than 1 texture unit and always set that to true in st_extensions.c), two-sided stencil, occlusion queries, anisotropic filtering, ycbcr textures, packed depth stencil (there may be more, that was just from a quick look). So if it's ok to remove them all from that list this should be done, but I fear it's not ok and the fix needs to be a bit more complicated (see comments in dri_init_extensions).

Roland

On 21.02.2010 16:00, Marek Olšák wrote: Hi, the attached patch modifies st/dri to not enable EXT_draw_buffers2 by default because r300g and most probably even some other drivers can't support this extension. The drivers reporting support of PIPE_CAP_INDEP_BLEND_ENABLE are not affected by this patch. Please review.
Marek

From ddda2c19b74780263f848ffafe10809bd6385d01 Mon Sep 17 00:00:00 2001
From: Marek Olšák <mar...@gmail.com>
Date: Sun, 21 Feb 2010 01:27:09 +0100
Subject: [PATCH 2/2] st/dri: don't enable EXT_draw_buffers2 by default

---
 src/gallium/state_trackers/dri/dri_extensions.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri_extensions.c b/src/gallium/state_trackers/dri/dri_extensions.c
index 1259813..7f8ceef 100644
--- a/src/gallium/state_trackers/dri/dri_extensions.c
+++ b/src/gallium/state_trackers/dri/dri_extensions.c
@@ -99,7 +99,6 @@ static const struct dri_extension card_extensions[] = {
    {"GL_EXT_blend_minmax", GL_EXT_blend_minmax_functions},
    {"GL_EXT_blend_subtract", NULL},
    {"GL_EXT_cull_vertex", GL_EXT_cull_vertex_functions},
-   {"GL_EXT_draw_buffers2", GL_EXT_draw_buffers2_functions},
    {"GL_EXT_fog_coord", GL_EXT_fog_coord_functions},
    {"GL_EXT_framebuffer_object", GL_EXT_framebuffer_object_functions},
    {"GL_EXT_multi_draw_arrays", GL_EXT_multi_draw_arrays_functions},
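One shape the "more complicated fix" could take is tying each extension-table entry to the cap it depends on, and only advertising entries whose cap the screen reports. A self-contained sketch of that idea; the table contents, the cap enum and the get_param stand-in are all hypothetical, not the real dri_extensions.c code:

```c
/* Cap-driven extension advertising sketch: an entry with required_cap
 * CAP_NONE is always safe; other entries are advertised only when the
 * (mocked) screen reports the cap. */
enum { CAP_NONE = 0, CAP_INDEP_BLEND_ENABLE, CAP_OCCLUSION_QUERY };

struct ext_entry {
   const char *name;
   int required_cap;
};

static const struct ext_entry card_extensions[] = {
   { "GL_EXT_blend_minmax",    CAP_NONE },
   { "GL_EXT_draw_buffers2",   CAP_INDEP_BLEND_ENABLE },
   { "GL_ARB_occlusion_query", CAP_OCCLUSION_QUERY },
};

/* screen->get_param stand-in: pretend hw without independent blend */
static int get_param(int cap)
{
   return cap == CAP_OCCLUSION_QUERY;
}

static int count_advertised(void)
{
   int n = 0;
   for (unsigned i = 0;
        i < sizeof card_extensions / sizeof card_extensions[0]; i++)
      if (card_extensions[i].required_cap == CAP_NONE ||
          get_param(card_extensions[i].required_cap))
         n++;
   return n;
}
```

With this shape, removing EXT_draw_buffers2 from the always-on list becomes a one-field change instead of deleting the entry.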
Re: [Mesa3d-dev] Mesa (master): r300g: remove L8_UNORM from colorbuffer formats
This isn't actually true any more. See issue (9) of ARB_framebuffer_object, which defines luminance, luminance_alpha and intensity formats as renderable. (I'm not quite sure how color assignment is done; readpixels and the like would define L = R + G + B, but I think it will follow the table from texture image specification instead, hence L = R, I = R.) You are quite right though that this is a recent addition, and in fact for instance i965 can't render to these either (it can render to red or alpha formats, but none of the l/i formats) directly, and neither can r300 (without shader hacking).

Roland

On 19.02.2010 15:35, Marek Olšák wrote: I still think st/xorg should use R8, which is well defined as to which component to store, rather than L8. That's also the reason L8 is not renderable in OpenGL.

2010/2/19 Corbin Simpson <mostawesomed...@gmail.com>: Yeah, I would have nak'd this. Will revert when I get home. Posting from a mobile, pardon my terseness. ~ C.

On Feb 19, 2010 12:56 AM, Michel Dänzer <mic...@daenzer.net> wrote: On Thu, 2010-02-18 at 19:24 -0800, Marek Olšák wrote:

Module: Mesa
Branch: master
Commit: fc427d23439a2702068209957f08990ea29fe21b
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=fc427d23439a2702068209957f08990ea29fe21b
Author: Marek Olšák <mar...@gmail.com>
Date: Fri Feb 19 04:23:06 2010 +0100

r300g: remove L8_UNORM from colorbuffer formats

Not renderable in OpenGL anyway.

The Xorg state tracker uses it though.

--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer
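Under the L = R interpretation from the texture image specification table mentioned above, reading back a luminance surface that the hardware actually rendered as R8 would just replicate the red channel into R, G and B. A sketch of that assumption (illustration only, not driver code):

```c
/* Expand an R8 readback to RGBA under the "L = R" rule: luminance
 * replicates into R/G/B, alpha of a luminance-only format reads as 1. */
static void r8_to_luminance_rgba(const unsigned char *src, float *dst, int n)
{
   for (int i = 0; i < n; i++) {
      float l = src[i] / 255.0f;
      dst[4 * i + 0] = l;      /* R = L */
      dst[4 * i + 1] = l;      /* G = L */
      dst[4 * i + 2] = l;      /* B = L */
      dst[4 * i + 3] = 1.0f;   /* A = 1 (no alpha channel) */
   }
}
```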
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
On 12.02.2010 14:44, michal wrote: Keith Whitwell wrote on 2010-02-12 14:28: On Fri, 2010-02-12 at 05:09 -0800, michal wrote: Keith Whitwell wrote on 2010-02-12 13:39: On Fri, 2010-02-12 at 04:32 -0800, Michał Król wrote:

Module: Mesa
Branch: master
Commit: aa0b671422880b99dc178d43d1e4e1a3f766bf7f
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa0b671422880b99dc178d43d1e4e1a3f766bf7f
Author: Michal Krol <mic...@vmware.com>
Date: Fri Feb 12 13:32:35 2010 +0100

util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats.

Michal, is this more like two different users expecting two different results in those unused columns? In particular, we definitely require the missing elements to be extended to (0,0,0,1) when fetching vertex data, and probably also in OpenGL texture sampling (if we supported these formats for that).

Gallium should follow D3D rules, so I've been following D3D here. Also, util_unpack_color_ub() in u_pack_color.h already sets the remaining fields to 0xff.

Note that D3D doesn't have the problem with expanding vertex attribute data, since you can't have X or XY vertex positions, only XYZ (with W extended to 1, as in GL) and XYZW. But surely D3D permits two-component texture coordinates, which would be PIPE_FORMAT_R32G32_FLOAT, and expanded as (r,g,0,1)... Brian added a table of differences between GL and other APIs recently to gallium/docs - does your change agree with that?

Where's that exactly? I can't find it.

It seems like we'd want to be able to support both usages - the alternative in texture sampling would be forcing the state tracker to generate variants of the shader when 2-component textures are bound. I would say that's an unreasonable requirement on the state tracker. It seems like GL would want (0,0,0,1) expansion everywhere, but D3D would want differing expansions in different parts of the pipeline. That indicates a single flag in the context somewhere isn't sufficient to choose between the two. Maybe there need to be two versions of these PIPE_FORMAT_ enums to capture the different values in the missing components? E.g.:

   PIPE_FORMAT_R32G32_0001_FLOAT
   PIPE_FORMAT_R32G32_1111_FLOAT

or something along those lines??

You are right. Alternatively, follow the more sane API (GL apparently), assume 0001 as the default and use the infix to override.

Note it's not just GL. D3D10 uses the same expansion. Only D3D9 is different. Well, for texture sampling anyway; I don't know what d3d does for vertex formats. Though for most hardware it would make sense to have only one format per different expansion, and use some swizzling parameter for sampling, because that's actually how the hardware works. But not all drivers will be able to do this, unfortunately. (Note that for instance, with i965, those two R32G32 formats mentioned here aren't really freely selectable. In OGL/DX10 mode you'll get the former, in d3d9 mode you get the latter. You can switch the mode, but you'll also get different border color interpretation along with it - something which is also not specified in gallium, though I guess you could say this is tied to gl_rasterization_rules. Maybe we could say the same here too: R32G32 is rg01 with gl_rasterization_rules and rg11 without? Seems a bit hackish, though.)

Roland
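The two candidate expansions being debated can be written down directly; a sketch assuming the GL/D3D10 rule of (r,g,0,1) vs the D3D9 rule of (r,g,1,1) for an R32G32 fetch:

```c
/* Expand a two-component fetch to four components.  GL and D3D10 fill
 * the missing B with 0, D3D9 fills it with 1; missing A is 1 either way. */
static void fetch_r32g32(const float *src, float *rgba, int d3d9_rules)
{
   rgba[0] = src[0];
   rgba[1] = src[1];
   rgba[2] = d3d9_rules ? 1.0f : 0.0f;  /* missing B */
   rgba[3] = 1.0f;                      /* missing A */
}
```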
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
On 12.02.2010 18:42, Keith Whitwell wrote: On Fri, 2010-02-12 at 09:28 -0800, José Fonseca wrote: On Fri, 2010-02-12 at 06:43 -0800, Roland Scheidegger wrote: [...] Though for most hardware it would make sense to have only one format per different expansion, and use some swizzling parameter for sampling, because that's actually how the hardware works. But not all drivers will be able to do this, unfortunately.

You mean having a swizzle in pipe_sampler_state? It sounds like a good idea. In the worst case some component will inevitably need to make shader variants with different swizzles. In this case it probably makes sense for it to be the pipe driver -- it's a tiny shader variation which could be done without recompiling the whole shader, but if the state tracker does it then the pipe driver will always have to recompile. In the best case it is handled by the hardware's texture sampling unit. It's in theory similar to baking the swizzle into the format as Keith suggested, but cleaner IMHO. The question is whether it makes sense to have full xyzw01 swizzles, or just 01 swizzles.

Another alternative is to just add the behaviour we really need - a single flag at context creation time that says what the behaviour of the sampler should be for these textures. Then the driver wouldn't have to worry about variants or mixing two different expansions. Hardware (i965 at least) seems to have one global mode to switch between these, and that's all we need to choose the right behaviour for each state tracker. It might be simpler all round just to specify it at context creation.

Yes, for rg01 vs rg11 this is easiest. It doesn't solve the depth texture mode problem though. Also, we sort of have that flag already; I think there's no reason why this needs to be separate from gl_rasterization_rules (though I guess in that case it's a bit of a misnomer...)

Roland
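The full xyzw01 swizzle option amounts to letting each output channel pick one of the four fetched channels or the constants 0/1. Gallium had no such field in pipe_sampler_state at the time, so this is purely a sketch of the proposal, not an existing interface:

```c
/* Six possible sources per output channel: the four fetched channels
 * plus the constants 0 and 1. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

static void apply_swizzle(const float in[4], const enum swz swz[4],
                          float out[4])
{
   for (int i = 0; i < 4; i++) {
      switch (swz[i]) {
      case SWZ_0: out[i] = 0.0f; break;
      case SWZ_1: out[i] = 1.0f; break;
      default:    out[i] = in[swz[i]]; break;  /* SWZ_X..SWZ_W are 0..3 */
      }
   }
}
```

The rg01-vs-rg11 question above then reduces to two swizzle settings, {X, Y, 0, 1} and {X, Y, 1, 1}, on the same R32G32 format.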
Re: [Mesa3d-dev] Mesa (master): util: Fix descriptors for R32_FLOAT and R32G32_FLOAT formats .
On 12.02.2010 19:00, Keith Whitwell wrote: On Fri, 2010-02-12 at 09:56 -0800, Roland Scheidegger wrote: [...] Yes, for rg01 vs rg11 this is easiest. It doesn't solve the depth texture mode problem though. Also, we sort of have that flag already; I think there's no reason why this needs to be separate from gl_rasterization_rules (though I guess in that case it's a bit of a misnomer...)

I'd prefer to avoid a big "I'm a GL/DX9 context" flag, and split different behaviours into different flags. Sure, a GL state tracker might set them all one way, but that doesn't mean some future state tracker wouldn't want to use a novel combination. The GL rasterization rules flag should be renamed to reflect what it's really asking for.

Ok
[Mesa3d-dev] nouveau changes for gallium-dynamicstencilref
Hi, could one of the nouveau developers please take a look at the nv30 changes I did for the stencil ref changes in the gallium-dynamicstencilref branch? I've just done it in a way I think might make sense, but I've absolutely no idea if it would work like that (and even if it would in theory, there might of course still be bugs in it...). Also, I was a bit confused about the so_new() parameters, as the numbers didn't seem to add up (assuming it's basically the max number of so_method and so_data calls). Anyway, if it makes sense I can do nv40/nv50 too; if not, tell me what needs to be done instead or do it yourself :-). Or it will break after the merge...

Roland
Re: [Mesa3d-dev] nouveau changes for gallium-dynamicstencilref
On 11.02.2010 21:42, Christoph Bumiller wrote: On 02/11/2010 09:02 PM, Roland Scheidegger wrote: Hi, could one of the nouveau developers please take a look at the nv30 changes I did for the stencil ref changes in gallium-dynamicstencilref branch? I've just done that in a way I think it might make sense, but I've absolutely no idea if it would work like that (and even if it would in theory there might of course still be bugs in it...)

Looks like it should work; I can't test nv30 myself though.

Also, I was a bit confused about the so_new() parameters as the numbers didn't seem to add up (assuming it's basically the max number of so_method and so_data calls).

It's (nr of so_method, nr of so_data + nr of so_reloc, nr of so_reloc), since relocs/addresses are considered data.

Ok, that's what I figured. The numbers were just wrong, then (nv30 used 5/21/0 but the actual max was 4/22/0; nv40 uses 4/21/0 and it should also be 4/22/0). Hence the confusion. At least things looked ok for nv50 (though it seems to needlessly split up the back face state into two so_methods - anyway that'll change).

Anyway, if it makes sense I can do nv40/nv50 too, if not tell me what needs to be done instead or do it yourself :-). Or it will break after the merge...

I think you'll get it right. nv40 should be about the same as nv30; nv50 state is a bit less elegant but should be easy to adjust, too.

Ok. Roland
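The sizing rule quoted above is easy to get wrong by hand; trivial bookkeeping keeps it straight. This is just an illustration of the counting convention (methods, data + relocs, relocs), not the nouveau so_* API itself:

```c
/* Count entries while "emitting" state, following the rule that every
 * reloc/address also occupies one data slot. */
struct so_size {
   unsigned methods, data, relocs;
};

static void size_method(struct so_size *s) { s->methods++; }
static void size_data(struct so_size *s)   { s->data++; }
static void size_reloc(struct so_size *s)  { s->data++; s->relocs++; }
```

Running the counters over the emit path and passing the totals to so_new() would avoid mismatches like the 5/21/0-vs-4/22/0 one above.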
Re: [Mesa3d-dev] fix the usual cell breakage
On 06.02.2010 15:07, Marc Dietrich wrote: also update the cell config a bit
---
 configs/linux-cell                                 |  6 ++--
 src/gallium/drivers/cell/common.h                  |  3 +-
 src/gallium/drivers/cell/spu/spu_per_fragment_op.c | 36 ++--
 3 files changed, 22 insertions(+), 23 deletions(-)

Sorry for that. I got confused there and thought the driver was using cell_blend_state rather than pipe_blend_state... cell_blend_state actually seems to be only a leftover from the past, though.

Roland
Re: [Mesa3d-dev] [RFC]: gallium-nopointsizeminmax merge
On 08.02.2010 18:27, Brian Paul wrote: On Mon, Feb 8, 2010 at 10:21 AM, Roland Scheidegger <srol...@vmware.com> wrote: This branch removes point_size_min and point_size_max because most hardware doesn't have any register to clamp this at rasterization time (of all gallium drivers, only r300 had this), and the mesa state tracker never actually used these fields properly. The clamp to the implementation limits will now be done in the vertex shader instead. Also, point_sprite enable is removed and replaced with a point_quad_rasterization field. The reason for this is that OGL actually has quite different rasterization rules for points and point sprites - hence this indicates whether points should be rasterized as points or according to point sprite rules (which decompose them into quads, basically). It is unclear to me if we'd actually really need to do something different for these rules in the draw module, or if hardware can do much with this information, but if there's hardware which can, well, you can use it. The point sprite coord enable no longer also indicates the sprite coord origin, since there's no api interested in this per coord. Testing was done with softpipe: pointblast doesn't work (does not draw any points at all), and spriteblast doesn't work correctly either (some points have their size cut somewhere vertically so they are rectangles). However, these bugs are not introduced by this branch; those must be bugs in the draw module present before - I'm still trying to figure out what goes wrong.

They're OK on master. I fixed some breakage in this area last week. See 54d7ec8e769b588ec93dea5bc04399e91737557e for example.

Yes, you're right, I missed that (probably because it wasn't in the draw module, but the actual drivers). pointblast works again, and spriteblast actually wasn't broken (parts of points simply disappeared due to depth test...) when cherry-picking those commits to the branch.
Roland
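The clamp the branch moves into the vertex path is conceptually just this; a sketch of the idea (the function name and its placement in the vertex path are illustrative, not the actual state tracker code):

```c
/* Clamp a shader-written point size to the implementation limits,
 * as would now happen in the vertex shader rather than via
 * point_size_min/point_size_max rasterizer state. */
static float clamp_point_size(float size, float min_size, float max_size)
{
   if (size < min_size)
      return min_size;
   if (size > max_size)
      return max_size;
   return size;
}
```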
Re: [Mesa3d-dev] Gallium DRI fbconfig/visual setup
On 05.02.2010 22:48, Corbin Simpson wrote: Two things... Are accumbufs still slow in Gallium-land? Should we still mark them as slow? How many multisamples should we actually pretend/advertise? Should we have a cap to check the number of multisamples supported? Should we just say that four samples are done for the fbconfig/visual, and then replace pipe_texture::nr_samples with a multisample boolean flag?

I think it would be nice if we could support multiple MSAA levels. Sure, hardware typically can do 4x MSAA, but maybe you'd really want max quality (with modern hw often offering 8x, and that's not taking specialties like CSAA into account) or only 2x. Maybe the cap should return a bitmask indicating what levels it supports.

Roland
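The bitmask-cap suggestion could look like the following, with bit n meaning "n samples supported". The encoding (and the idea of deriving a maximum from it) is a guess at what such a cap could return, not an existing gallium interface:

```c
/* Query helpers for a hypothetical sample-count bitmask cap. */
static int msaa_supported(unsigned mask, unsigned nr_samples)
{
   return nr_samples < 32 && (mask & (1u << nr_samples)) != 0;
}

static unsigned msaa_max(unsigned mask)
{
   unsigned best = 0;
   for (unsigned n = 1; n < 32; n++)
      if (mask & (1u << n))
         best = n;      /* highest set bit wins */
   return best;
}
```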
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
On 03.02.2010 16:07, michal wrote: Keith, This feature branch adds cylindrical wrap texcoord mode to gallium shader tokens and removes prefilter field from sampler state. Implemented cylindrical wrapping for linear interpolator in softpipe. Not sure whether it makes sense to do it for perspective interpolator. Documented TGSI declaration token. Sample fragment shader declaration that wraps S and T coordinates follows. DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY Please review so I can merge it to master. Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d. Roland
Re: [Mesa3d-dev] [RFC] gallium-cylindrical-wrap branch
On 03.02.2010 17:45, michal wrote: Roland Scheidegger wrote on 2010-02-03 16:47: On 03.02.2010 16:07, michal wrote: Keith, This feature branch adds cylindrical wrap texcoord mode to gallium shader tokens and removes prefilter field from sampler state. Implemented cylindrical wrapping for linear interpolator in softpipe. Not sure whether it makes sense to do it for perspective interpolator. Documented TGSI declaration token. Sample fragment shader declaration that wraps S and T coordinates follows. DCL INPUT[0], GENERIC[0], LINEAR, CYLWRAP_XY Please review so I can merge it to master. Michal, why do you need this for linear interpolator and not perspective? I think d3d mobile let you disable perspective correct texturing, but it is always enabled for normal d3d. I could not think of a use case that uses perspective and cylindrical interpolation at the same time. If you think it's valid, we can implement cylindrical wrapping for perspective interpolator, but then I am not sure how exactly it should be done, i.e. should we divide and then wrap or the opposite? Good question. Unfortunately the description of what the wrap renderstate does doesn't say anything about that. I just assumed since perspective correction is usually always enabled, it would be enabled even when wrapping is used on some coordinates. Not sure what the order of wrap/divide should be... Also, d3d lets you wrap the 4th coordinate, does that really make sense? Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
On 31.01.2010 18:41, Christoph Bumiller wrote: On 31.01.2010 01:37, Roland Scheidegger wrote: Marek Olšák wrote: 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in the fragment constants reads, Since Gallium doesn't support GL_ARB_shadow_ambient, this is always (0,0,0,0), right? I think the extension could be added to Gallium since the r300 compiler can generate code for it. It could. But generally, gallium doesn't implement features common hardware can't do (not only because most drivers except software based ones couldn't implement it, but those features also turn out to be rarely used, for obvious reasons). r300 is an exception here since it emulates ARB_shadow anyway. Though I think if you can make a case why this is really necessary it could be done, but that's not my call. Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Even though this is a rarely used feature in OpenGL nowadays, it should get fixed if we want to be GL-compliant. That means adding depth texture modes in pipe_sampler_state and setting them in the Mesa state tracker. The R300 compiler can already generate code for these modes as well. Note R300 is again special a bit here. Actually, I realized my earlier answer doesn't make sense. Hardware which actually supports EXT_texture_swizzle (and native ARB_shadow) should be able to implement this easily. Hardware like i965 which doesn't support EXT_texture_swizzle could do it in the shader. Maybe it would make sense to add EXT_texture_swizzle capability in gallium (in the sampler state). That would solve this in a bit more generic way than some special bits for depth texture mode. From my point of view adding a swizzle in the sampler state is a bad idea. 
On nv50, this would make texture setup dependent on sampler state: we have an Image and Sampler configuration buffer containing entries that can be bound to texture and sampler units. The texture swizzle would be supported by setting a different format in the image entry, like BGRA instead of RGBA, just that it also supports RGRG or whatever you like. Well, the normalization bit seems to be stored in the TIC entries instead of the TSC ones already, I guess that comes from the rectangle texture type, but let's ignore that. I don't see texture swizzle in d3d10 (but then, I don't know d3d10 very well), and OpenGL doesn't separate textures and samplers anyway, so I'd put it in texture state. Keeping a bunch of shaders for texture swizzles doesn't sound nice either. Of course, if other hardware would prefer this in sampler state, then ... ah, I should probably let go of the illusion that gallium state will continue to nicely map to my hardware ... I don't know if other hardware likes it in sampler state, but the problem is it really is sampler state. This is not a property of the texture, it can change anytime and you don't want to recreate the texture just because this changes, I think. I'm not sure how to implement this nicely either, but I'd guess we'd at least want the swizzle fields to correspond to hardware channels (so for a luminance_alpha texture, the swizzle would indicate rrrg for the first and second channels respectively), not the GL-after-sampling mapping as the extension uses. Hence depth textures used as luminance would be rrr1, as alpha 000r, and as intensity rrrr. So, basically, an a8l8 texture would be equivalent to a r8g8 texture, when used for sampling with the same swizzling. Note that an easy solution (for depth textures) would be to add just new depth texture formats (one for each of the alpha, luminance, and intensity modes), but then again you make this part of the texture, which it is not.
Those are just some quick thoughts however, I don't think anyone would be opposed if you can come up with a nice solution for this. Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
On 01.02.2010 20:23, Brian Paul wrote: Speaking of texture formats and texture sampling, one area of Gallium that's under-specified is what (x,y,z,w) values are returned by TEX instructions when sampling from each of the various texture formats. A while back I started a table comparing OpenGL to D3D:

  texture components   OpenGL                   D3D
  ------------------   ----------------------   -----------------------------
  R,G,B,A              (R,G,B,A)                (R,G,B,A)
  R,G,B                (R,G,B,1)                (R,G,B,1)
  R,G                  (R,G,0,1)                (R,G,1,1)
  R                    (R,0,0,1)                (R,1,1,1)
  A                    (0,0,0,A)                (0,0,0,A)
  L                    (L,L,L,1)                (L,?,?,1) (probably L,L,L,1)
  I                    (I,I,I,I)                (?,?,?,?)
  UV                   (0,0,0,1)*               (U,V,1,1)
  Z                    (Z,Z,Z,Z) or (0,Z,0,1)   (Z,Z,Z,1) or (0,0,0,Z)**
  other formats?       ...                      ...

A,L: should be (L,L,L,A) for both OGL and D3D. And yes, (L,L,L,1) is correct for D3D (that's what i965 at least does). There are no intensity textures in d3d (unless you can somehow support that via cap bits, but in that case I'd certainly expect it to be (I,I,I,I)). UV is of course really odd in OGL, since it says the sample result is constant but of course you still use the UV components for the bump target. That's just an oddity to make that fit somehow into the fixed function pipeline rather than it having anything to do with hardware. Note that the D3D column is only valid for DX9 (and older). DX10 uses the same mappings as OpenGL (if it supports the format; all luminance, alpha etc. textures are gone, as are the swizzled bgra formats). * per http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt ** depends on GL_DEPTH_TEXTURE_MODE state For OpenGL, see page 141 of the OpenGL 3.1 spec. For D3D, see http://msdn.microsoft.com/en-us/library/ee422472(VS.85).aspx We should first add a column to the above table for Gallium and then decide whether to implement swizzling (and GL_DEPTH_TEXTURE_MODE) with extra GPU instructions or new texture/sampler swizzle state. But most gpus can do arbitrary swizzling natively, hence inserting gpu instructions really must be optional.
Even hardware which can't do arbitrary swizzling can sometimes do both OGL and D3D mapping, hence we don't really want additional instructions there either (i965 being the example, though it's not easy to switch behavior, since that affects not only the format of the border color but also how the border color is used if the particular channel isn't in the texture). I think we'd want DX10/OGL behavior, and u_format defines it that way. Except for depth/stencil formats, where the depth always ends up in the red channel and stencil in green (with the rest undefined). i965 actually has different depth/stencil formats (a24x8, l24x8, i24x8) just for those depth texture modes (though the code suggests it won't do anything if shadow comparison is enabled). Or maybe we'd want additional formats just for DX9 - sounds like overkill though. The different border color interpretation of i965 suggests to me that it won't do much on its own for conformance anyway. I think the swizzle values used by u_format are nice. Using xyzw rather than rgba to refer to the first, etc. channel avoids confusion. Hence I'd propose we'd use the same for the hypothetical sampler swizzle state (that is x,y,z,w,0,1, not sure if the _ undefined makes sense there). The swizzling would be the same as that indicated in u_format for all textures initially, except depth/stencil. Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
On 30.01.2010 13:06, Corbin Simpson wrote: Handful of random things bugging me. 2) progs/tests/drawbuffers and progs/tests/drawbuffers2, and possibly others, segfault with both softpipe and the HW driver at sl_pp_version.c:45. I think there's some codegen going on there? At any rate, if anybody has any hints on how to solve it, that'd be nice. Works for me (with softpipe). 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in the fragment constants reads, Since Gallium doesn't support GL_ARB_shadow_ambient, this is always (0,0,0,0), right? I'd think so. This extension isn't in core GL (and d3d can't do it neither), and AFAIK there's no hardware (which doesn't emulate the shadow functionality in the fragment shader) which could actually do it. Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Yes, that's not very clean, but there doesn't seem to be an easy solution for this. Exposing this in gallium only seems marginally useful, since again modern hardware can't really do anything useful with that information neither. Maybe would need to tweak the shader if actually the wrong channels are used (probably shouldn't be the driver's responsibility to do this), but I guess assuming default LUMINANCE works just fine usually. New OGL won't have that problem... 7) Is there more information on the dual-source blend modes? I'm not sure if I can do them; might have to bug AMD for the register values. Pretty sure most pre-DX10 hardware can't do that, the blend unit just doesn't have access to multiple source colors. I've attached a small patch which shows how softpipe implements it. (But I still need to write a testcase probably for the python statetracker to see if it actually works...). 
Pretty easy really, just using pixel shader color outputs 0 and 1 for blending (note that in this form this restricts dual source blending to one render target, this is the same restriction DX10 enforces, and if you look at i965 docs it actually has this restriction in hardware). I think that's it for now. Sorry for all the questions, but I'm really starting to get a good handle on the hardware and interface, and I'm ready to start beating the classic driver in serious benchmarks; I think that r300's probably the most mature driver alongside nv50 and maybe nv40. Great! Roland

diff --git a/src/gallium/drivers/softpipe/sp_quad_blend.c b/src/gallium/drivers/softpipe/sp_quad_blend.c
index d65307b..85fda0b 100644
--- a/src/gallium/drivers/softpipe/sp_quad_blend.c
+++ b/src/gallium/drivers/softpipe/sp_quad_blend.c
@@ -222,7 +222,7 @@ logicop_quad(struct quad_stage *qs,
 static void
 blend_quad(struct quad_stage *qs,
-           float (*quadColor)[4],
+           float (*quadColors)[4][4],
            float (*dest)[4],
            unsigned cbuf)
 {
@@ -230,6 +230,7 @@ blend_quad(struct quad_stage *qs,
    static const float one[4] = { 1, 1, 1, 1 };
    struct softpipe_context *softpipe = qs->softpipe;
    float source[4][QUAD_SIZE] = { { 0 } };
+   float (*quadColor)[4] = quadColors[cbuf];
 
    /*
     * Compute src/first term RGB
@@ -298,11 +299,23 @@ blend_quad(struct quad_stage *qs,
       }
       break;
    case PIPE_BLENDFACTOR_SRC1_COLOR:
-      assert(0); /* to do */
-      break;
+      {
+         float (*quadColor1)[4] = quadColors[1];
+         assert(cbuf == 0);
+         VEC4_MUL(source[0], quadColor[0], quadColor1[0]); /* R */
+         VEC4_MUL(source[1], quadColor[1], quadColor1[1]); /* G */
+         VEC4_MUL(source[2], quadColor[2], quadColor1[2]); /* B */
+      }
+      break;
    case PIPE_BLENDFACTOR_SRC1_ALPHA:
-      assert(0); /* to do */
-      break;
+      {
+         const float *alpha = quadColors[1][3];
+         assert(cbuf == 0);
+         VEC4_MUL(source[0], quadColor[0], alpha); /* R */
+         VEC4_MUL(source[1], quadColor[1], alpha); /* G */
+         VEC4_MUL(source[2], quadColor[2], alpha); /* B */
+      }
+      break;
    case PIPE_BLENDFACTOR_ZERO:
       VEC4_COPY(source[0], zero); /* R */
       VEC4_COPY(source[1], zero); /* G */
@@ -372,11 +385,29 @@ blend_quad(struct quad_stage *qs,
       }
       break;
    case PIPE_BLENDFACTOR_INV_SRC1_COLOR:
-      assert(0); /* to do */
-      break;
+      {
+         float (*quadColor1)[4] = quadColors[1];
+         float inv_comp[4];
+         assert(cbuf == 0);
+         VEC4_SUB(inv_comp, one, quadColor1[0]); /* R */
+         VEC4_MUL(source[0], quadColor[0], inv_comp); /* R */
+         VEC4_SUB(inv_comp, one, quadColor1[1]); /* G */
+         VEC4_MUL(source[1], quadColor[1], inv_comp); /* G */
+         VEC4_SUB(inv_comp, one, quadColor1[2]); /* B */
+         VEC4_MUL(source[2], quadColor[2], inv_comp); /* B */
+      }
+      break;
    case PIPE_BLENDFACTOR_INV_SRC1_ALPHA:
-      assert(0); /* to do */
-      break;
+      {
+         const float *alpha = quadColors[1][3];
+         float
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
Marek Olšák wrote: 6) GL_ARB_shadow_ambient and texture compare-fail values. A comment in the fragment constants reads, Since Gallium doesn't support GL_ARB_shadow_ambient, this is always (0,0,0,0), right? I think the extension could be added to Gallium since the r300 compiler can generate code for it. It could. But generally, gallium doesn't implement features common hardware can't do (not only because most drivers except software based ones couldn't implement it, but those features also turn out to be rarely used, for obvious reasons). r300 is an exception here since it emulates ARB_shadow anyway. Though I think if you can make a case why this is really necessary it could be done, but that's not my call. Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Even though this is a rarely used feature in OpenGL nowadays, it should get fixed if we want to be GL-compliant. That means adding depth texture modes in pipe_sampler_state and setting them in the Mesa state tracker. The R300 compiler can already generate code for these modes as well. Note R300 is again special a bit here. Actually, I realized my earlier answer doesn't make sense. Hardware which actually supports EXT_texture_swizzle (and native ARB_shadow) should be able to implement this easily. Hardware like i965 which doesn't support EXT_texture_swizzle could do it in the shader. Maybe it would make sense to add EXT_texture_swizzle capability in gallium (in the sampler state). That would solve this in a bit more generic way than some special bits for depth texture mode. 7) Is there more information on the dual-source blend modes? I'm not sure if I can do them; might have to bug AMD for the register values. I bet R300 can't do these modes. It's only a Direct3D 10.0 feature, not present in Direct3D 10.1. 
MS must have a good reason to remove it. Where did you see that it's removed in 10.1? Here's a list of blend ops in d3d11: http://msdn.microsoft.com/en-us/library/ee416042(VS.85).aspx Note this feature can be present (via cap bits in some limited form) in D3D9Ex too, and I thought windows actually used it for (antialiased) text rendering (but don't quote me on that). BTW I looked at some of your patches and r3xx-r5xx cards don't even support separate blend enables, therefore the cap should be 0. Or are you going to emulate this using independent color channel masks and two rendering passes? That could be done in the state tracker. Also, I think the indep. color masks are r5xx-only. I also think even r500 shouldn't say this is supported. Just changing the colormasks isn't going to be very correct... Roland
Re: [Mesa3d-dev] Grab bag of random questions (whoo)
Corbin Simpson wrote: Another comment reads, Gallium doesn't provide us with any information regarding this mode, so we are screwed. I'm setting 0 = LUMINANCE, above the texture compare modes. I don't really like that section of code, but it probably can't get cleaner, right? Yes, that's not very clean, but there doesn't seem to be an easy solution for this. Exposing this in gallium only seems marginally useful, since again modern hardware can't really do anything useful with that information neither. Maybe would need to tweak the shader if actually the wrong channels are used (probably shouldn't be the driver's responsibility to do this), but I guess assuming default LUMINANCE works just fine usually. New OGL won't have that problem... New OGL? GL3? Sweet. Well, all the luminance/intensity stuff goes away, so problem solved :-). Even when mesa gets to GL3 eventually, we'd still need to deal with that for ARB_compatibility, at least in theory. See also my answer in the other email, I was quite wrong that hardware typically can't do much with it. Roland
Re: [Mesa3d-dev] [PATCH] hack around commas in macro argument
On 26.01.2010 09:18, Marvin wrote: Jose, Brian, Marc, Why is this necessary? It has been working fine so far. Which gcc version are you using? What commas are you referring to? the PIPE_ALIGN_TYPE macro is so far only used in the cell driver in src/gallium/drivers/cell/spu/spu_main.c (this is probably why no one noticed it). The macro takes a type, a struct in this case, which can include commas:

PIPE_ALIGN_TYPE(16, struct spu_framebuffer {
   void *color_start;              /** addr of color surface in main memory */
   void *depth_start;              /** addr of depth surface in main memory */
   enum pipe_format color_format;
   enum pipe_format depth_format;
   uint width, height;             /** size in pixels */
   uint width_tiles, height_tiles; /** width and height in tiles */
   uint color_clear_value;
   uint depth_clear_value;
   uint zsize;                     /** 0, 2 or 4 bytes per Z */
   float zscale;                   /** 65535.0, 2^24-1 or 2^32-1 */
});

This will cause a problem, as the macro will treat each comma as an argument separator and thus the number of arguments is larger than 2. Hmm, maybe we could just avoid the problem by not using commas in the struct declaration? Roland
[Mesa3d-dev] perrtblend merge
Hi, I'm planning on merging this branch to master soon. This will make it possible to do per render target blend enables, colormasks, and also per rendertarget blend funcs (with a different CAP bit for the latter, and this one isn't actually used in mesa state tracker yet). None of the drivers other than softpipe implement any of it, but they were adapted to the interface changes so should continue to run. Apparently, that functionality is only interesting for drivers supporting multiple render targets, and the hw probably needs to be quite new (I know that i965 could support it (well not the multiple blend funcs but the rest), but the driver currently only supports 1 render target). Roland
Re: [Mesa3d-dev] perrtblend merge
Oh, I should have added the PIPE_CAP bits (even if not supported) to all drivers. Good catch. I'll do that for the other drivers now. Roland (btw, I think r500 could do separate colormasks, but not separate blend enables, and there might be more hardware like that. However, this is not exposed by GL, it might be supported by some DX9 cap bit, but it didn't seem worthwhile to add a separate gallium cap bit for supporting per-rt blend enables and colormasks, respectively.) On 26.01.2010 16:37, Corbin Simpson wrote: Yeah, r300 doesn't but r600 does. I've read through the branch, and the r300g patch looks perfect. I've pushed another patch on top for the pipe caps, to avoid post-merge cleanups for myself. On Tue, Jan 26, 2010 at 7:00 AM, Alex Deucher alexdeuc...@gmail.com wrote: On Tue, Jan 26, 2010 at 9:44 AM, Roland Scheidegger srol...@vmware.com wrote: Hi, I'm planning on merging this branch to master soon. This will make it possible to do per render target blend enables, colormasks, and also per rendertarget blend funcs (with a different CAP bit for the latter, and this one isn't actually used in mesa state tracker yet). None of the drivers other than softpipe implement any of it, but they were adapted to the interface changes so should continue to run. Apparently, that functionality is only interesting for drivers supporting multiple render targets, and the hw probably needs to be quite new (I know that i965 could support it (well not the multiple blend funcs but the rest), but the driver currently only supports 1 render target). FWIW, AMD R6xx+ hw supports MRTs and per-MRT blends as well, although at the moment the driver also only supports 1 RT. Alex
Re: [Mesa3d-dev] What about gl_rasterization_rules?
On 21.01.2010 18:47, Luca Barbieri wrote: On Thu, Jan 21, 2010 at 6:34 PM, Corbin Simpson mostawesomed...@gmail.com wrote: Maybe it's just me, since I actually wrote the docs, but does anybody else read them? From cso/rasterizer.html (viewable at e.g. http://people.freedesktop.org/~csimpson/gallium-docs/cso/rasterizer.html ): gl_rasterization_rules Whether the rasterizer should use (0.5, 0.5) pixel centers. When not set, the rasterizer will use (0, 0) for pixel centers. So why aren't these patches using pipe_rasterizer_state::gl_rasterization_rules? It's a different thing. gl_rasterization_rules affects the way fragments are rasterized, i.e. the set of fragments which a primitive is mapped to. Changing it is equivalent to adding/subtracting a subpixel offset to the viewport (which seemingly depends on the primitive type). The pixel center convention instead sets how the values look like in the fragment shader. Changing it is equivalent to adding/subtracting 0.5 to the fragment.position in the fragment shader. In other words, yes, if you set gl_rasterization_rules and the pixel center in a mismatched way, fragment.position will not be the coordinate of the rasterization center. As another example, suppose you do a blit with the 3D engine using fragment.position to sample from a texture rectangle with bilinear filtering. A wrong rasterization convention may cause 1 pixel black bars at the borders. A wrong pixel center convention will cause a 2x2 blur filter to be applied to the texture. BTW, gl_rasterization_rules is ignored by almost all drivers Most but not all. Not the software based ones, for instance. Should be easy to add to r300 (and the nouveau ones, I assume), I guess these simply don't care enough about environments with different (= DX9) rasterization rules :-). Roland From the spec: The scope of this extension deals *only* with how the fragment coordinate XY location appears during programming fragment processing. 
Beyond the scope of this extension are coordinate conventions used for rasterization or transformation. -- Throughout its 18-year history, RSA Conference consistently attracts the world's best and brightest in the field, creating opportunities for Conference attendees to learn about information security's most important issues through interactions with peers, luminaries and emerging and established companies. http://p.sf.net/sfu/rsaconf-dev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] [RFC] gallium-multiple-constant-buffers merge
On 21.01.2010 20:20, michal wrote: Hi, This simple feature branch adds support for two-dimensional constant buffers in TGSI. An example shader would look like this:

FRAG
DCL IN[0], COLOR, LINEAR
DCL OUT[0], COLOR
DCL CONST[1][1..2]
MAD OUT[0], IN[0], CONST[1][2], CONST[1][1]
END

For this to work, one needs to bind a buffer to slot nr 1 containing at least 3 vectors. Looks good to me - I wondered how you'd use the multiple constant buffers possible by the gallium interface, and that is how :-). Is that something we'd need a cap bit for in the future? Would this be also used by ARB_uniform_buffer_object / GL 3.1? Roland
Re: [Mesa3d-dev] [PATCH 2/2] st: don't assert on empty fragment program
On 18.01.2010 19:15, Luca Barbieri wrote: Breakpoint 3, _mesa_ProgramStringARB (target=34820, format=34933, len=70, string=0x85922ba) at shader/arbprogram.c:434 434 GET_CURRENT_CONTEXT(ctx); $31 = 0x85922ba !!ARBfp1.0\n\nOPTION ARB_precision_hint_fastest;\n\n\n\nEND\n Not sure why Sauerbraten does this, but it does, at least on my system (Ubuntu Karmic, nv40 driver) and it should be legal. Probably depth writes only enabled for things like shadows? Roland
Re: [Mesa3d-dev] Gallium feature levels
On 11.01.2010 22:03, Zack Rusin wrote: On Monday 11 January 2010 15:17:00 Roland Scheidegger wrote: - extra mirror wrap modes - i don't think mirror repeat was ever supported and mirror clamp was removed in d3d10 but it seems that some hardware kept support for those Mirror repeat is a core feature in GL since 1.4 hence we can't just drop it. I wasn't suggesting that. I was just pointing out what happens with it from the D3D side. I think all hardware we'd ever care about would support it. mirror clamp / mirror clamp to edge are only an extension, though (ATI_texture_mirror_once). (I think the dx mirror once definition is probably mirror_clamp_to_edge in opengl parlance). That's possible. As mentioned I'm not really sure what to do with this feature. - shadow maps - it's more of a researched guess since it's largely based on format support, but as far as i can tell all d3d10 hardware supports it, earlier it varies (e.g. nvidia did it for ages) Required for GL 1.4. I thought it was pretty much required for d3d sm2.0, though you're right you could probably just not support the texture format there. Anyway, most hardware should support it, and I believe even chips which didn't really support it natively at DX9 SM 2.0 time could handle it (chips like radeon r300 lacked the hw to do the comparison in the texture unit, but it can be more or less easily implemented in the pixel shader, though the implementation will suck as it certainly won't do PCF, just use some point sampling version - unless you're willing to do a much more complex implementation in the pixel shader, but then on this generation of hardware you might exceed maximum shader length). I believe all hardware supporting SM 2.0 could at least do some sampling of depth textures, though possibly only 16 bit and I'm not sure filtering worked in all cases. Yes, but the issue is that I'm not sure how to represent it from a feature level perspective. Are you saying we should just enable it for all feature levels? That'd be nice.
Hmm, maybe. I think the other stuff is acceptable. Take a look at the docs and let me know what you think. What is feature level 1 useful for? I thought we'd really wanted DX9 level functionality as a bare minimum. GL2.x certainly needs cards supporting shader model 2 (and that is a cheat, in reality it would be shader model 3). The main issue was having something without hardware vertex shaders in the feature levels. It was supposed to be whatever the current i915 driver currently supports, but yea, I think it doesn't make sense and level 2 should be minimum. Also, I don't quite get the shader model levels. I thought there were mainly two different DX9 versions, one with sm 2.0 the other with 3.0, with no one caring about other differences (as most stuff was cappable anyway). However, you've got 3 and all of them have 2.0 shader model? As mentioned this is based on the D3D feature level concept. It's the first link I put in the references: http://msdn.microsoft.com/en-us/library/ee422086(VS.85).aspx#Overview It's there because that's what Microsoft defined as feature level and I'm assuming it's because they had a good need for it :) Ah, that's why it doesn't make much sense :-). I'm not sure what requirements got them to these levels. I definitely think those 3 dx9 levels are very odd and don't even make sense for d3d only, much less for gallium. For example, requires at least max aniso 16? You've got to be kidding; the aniso spec is so fuzzy you can pass almost any cheap point filter as compliant anyway, so it doesn't make any sense (plus, this only really enhances filtering quality, it makes absolutely zero difference for writing applications). I think the retrofit of 9_1, 9_2, 9_3 to some arbitrary DX9 versions doesn't really match hardware either. The most distinguishable feature of DX9.0c (which was the last version IIRC) was definitely SM 3.0, but of course like everything else (multiple render targets, etc.) it was optional.
I think for gallium it would make way more sense to expose only 2 feature levels - basically drop 9_1, and additionally bump 9_3 to include SM 3.0 (I wonder if that's not just a typo there, after all the model is called ps_4_0_level_9_3 unlike the others which are called 9_1 only). Though granted nv20/25 can't do separate alpha blend (but it can't really do fragment shaders either so I don't know how well that driver is ever going to work), i915 may not be able to do occlusion queries (not sure if the hw can't do it but the current driver doesn't implement it), everybody (I think) can do mirror_once, and I don't know what overlapping vertex elements are. More comments below.

+static const enum pipe_feature_level
+i915_feature_level(struct pipe_screen *screen)
+{
+   return PIPE_FEATURE_LEVEL_1;
+}

What's the reason this is not feature level 2? Yea, I was winging it for all the drivers because I couldn't be bothered to do a cross
[Mesa3d-dev] gallium-noconstbuf merge
Hi, I'll plan to merge gallium-noconstbuf today. It's a pretty simple API change, so things should continue to run :-). Roland -- This SF.Net email is sponsored by the Verizon Developer Community Take advantage of Verizon's best-in-class app development support A streamlined, 14 day to market process makes app distribution fast and easy Join now and get one step closer to millions of Verizon customers http://p.sf.net/sfu/verizon-dev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] gallium-noconstbuf merge
On 11.01.2010 16:42, Keith Whitwell wrote: On Mon, 2010-01-11 at 07:33 -0800, Roland Scheidegger wrote: Hi, I'll plan to merge gallium-noconstbuf today. It's a pretty simple API change, so things should continue to run :-). Roland, Before you do this, can you make sure that the set_constant_buffer() entrypoint is properly documented in gallium/docs? I was planning to do that after the merge, since the branch is too old to include docs, so I'd have to merge from master just for that if I did it before the merge. Roland
Re: [Mesa3d-dev] gallium-noconstbuf merge
On 11.01.2010 16:53, Keith Whitwell wrote: On Mon, 2010-01-11 at 07:50 -0800, Roland Scheidegger wrote: On 11.01.2010 16:42, Keith Whitwell wrote: On Mon, 2010-01-11 at 07:33 -0800, Roland Scheidegger wrote: Hi, I'll plan to merge gallium-noconstbuf today. It's a pretty simple API change, so things should continue to run :-). Roland, Before you do this, can you make sure that the set_constant_buffer() entrypoint is properly documented in gallium/docs? I was planning to do that after the merge, since the branch is too old to include docs, so I'd have to merge from master just for that if I did it before the merge. OK -- that's a decent excuse... Can you post a first draft of the docs here before merging? Ok here's a first stab. Actually I'm not sure what documentation should look like, there are no other functions really commented yet. Should these include function parameters / return values? Also I'll need to work on the syntax a bit I know... Roland

diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 21f5f91..a6fe408 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -34,6 +34,12 @@ buffers, surfaces) are bound to the driver.
 * ``set_constant_buffer``
+void (*set_constant_buffer)( struct pipe_context *,
+                             uint shader, uint index,
+                             struct pipe_constant_buffer *buf );
+Sets a constant buffer to be used in a given shader type. index is
+used to indicate which buffer to set (note that some APIs allow multiple ones
+to be set, though drivers are mostly restricted to the first one right now).
* ``set_framebuffer_state``
* ``set_fragment_sampler_textures``
* ``set_vertex_sampler_textures``
Re: [Mesa3d-dev] Gallium feature levels
On 11.01.2010 18:49, Zack Rusin wrote: Hey, knowing that we're starting to have serious issues with figuring out what features the given device supports and what APIs/extensions can be reasonably implemented on top of it, I've spent the weekend trying to define feature levels. Feature levels were effectively defined by the Direct3D version numbers. Attached is a patch and documentation for the feature levels. I'm also attaching a gallium_feature_levels.rst file which documents what each feature level means and what apis can be reasonably supported by each (I figured it's going to be easier to look at it outside the diff). There's a few features that are a bit problematic, in no particular order:
- unnormalized coordinates, we don't even have a cap for those right now but since that feature doesn't exist in direct3d (all coords are always normalized in d3d) the support for it is hard to define in terms of a feature level
- two-sided stencil - d3d supports it in d3d10 but tons of hardware supported it earlier
- extra mirror wrap modes - i don't think mirror repeat was ever supported and mirror clamp was removed in d3d10 but it seems that some hardware kept support for those
Mirror repeat is a core feature in GL since 1.4 hence we can't just drop it. I think all hardware we'd ever care about would support it. mirror clamp / mirror clamp to edge are only an extension, though (ATI_texture_mirror_once). (I think the dx mirror once definition is probably mirror_clamp_to_edge in opengl parlance).
- shadow maps - it's more of a researched guess since it's largely based on format support, but as far as i can tell all d3d10 hardware supports it, earlier it varies (e.g. nvidia did it for ages)
Required for GL 1.4. I thought it was pretty much required for d3d sm2.0, though you're right you could probably just not support the texture format there.
Anyway, most hardware should support it; I believe even those chips which didn't really support it at DX9 SM 2.0 time could handle it (chips like radeon r300 lacked the hw to do the comparison in the texture unit, but it can be more or less easily implemented in the pixel shader, though the implementation will suck as it certainly won't do PCF, just use some point sampling version - unless you're willing to do a much more complex implementation in the pixel shader, but then on this generation of hardware you might exceed maximum shader length). I believe all hardware supporting SM 2.0 could at least do some sampling of depth textures, though possibly only 16 bit and I'm not sure filtering worked in all cases. I think the other stuff is acceptable. Take a look at the docs and let me know what you think. What is feature level 1 useful for? I thought we'd really wanted DX9 level functionality as a bare minimum. GL2.x certainly needs cards supporting shader model 2 (and that is a cheat, in reality it would be shader model 3). Also, I don't quite get the shader model levels. I thought there were mainly two different DX9 versions, one with sm 2.0 the other with 3.0, with no one caring about other differences (as most stuff was cappable anyway). However, you've got 3 and all of them have 2.0 shader model? More comments below.

+static const enum pipe_feature_level
+i915_feature_level(struct pipe_screen *screen)
+{
+   return PIPE_FEATURE_LEVEL_1;
+}

What's the reason this is not feature level 2?

+static const enum pipe_feature_level
+nv30_screen_feature_level(struct pipe_screen *screen)
+{
+   return PIPE_FEATURE_LEVEL_1;
+}
+

Hmm in theory this should be feature level 2. Maybe the driver doesn't quite cut it though...

+static const enum pipe_feature_level r300_feature_level(
+   struct pipe_screen* pscreen)
+{
+   if (r300screen->caps->is_r500) {
+      return PIPE_FEATURE_LEVEL_2;
+   } else {
+      return PIPE_FEATURE_LEVEL_1;
+   }
+}

Shouldn't one be feature level 3 (or maybe 4?) the other 2?
Profile          7 (2009)   6 (2008)   5 (2006)     4 (2004)       3 (2003)       2 (2002)       1 (2000)
API Support      DX11       DX10.1     DX10/GL3.2   DX9.2          DX9.1          DX9.0          DX7.0
                 GL4.0      GL3.2+     GL3.2        GL3.0          GL2.x          GL2.x          GL2.x
                 VG         VG         VG           VG             VG             VG             VG
                 CL1.0      CL1.0      CL1.0
Shader Model     5.0        4.x        4.0          2.0            2.0            2.0            1.0
Fragment Shader                                     4_0_level_9_3  4_0_level_9_1  4_0_level_9_1
Re: [Mesa3d-dev] RFC: gallium changes for conditional rendering
On 04.01.2010 15:48, Brian Paul wrote: Keith Whitwell wrote: On Thu, 2009-12-31 at 15:57 -0800, Brian Paul wrote: The BY_REGION modes indicate that it's OK for the GPU to discard the fragments in the region(s) which failed the occlusion test (perhaps skipping other per-fragment ops that would have otherwise occurred). See the spec at http://www.opengl.org/registry/specs/NV/conditional_render.txt for details. I'd be happy to omit those modes for now. But since they're in the NV spec, I suspect NVIDIA hardware (at least) can make use of them. Brian, Lets leave them in - I'm presuming the no-op implementation which maps them down to the regular tokens is fine. Yes. Incidentally, it would be fairly easy to take advantage of the BY_REGION modes in the llvm driver. If the number of samples passed in a tile during occlusion testing is zero, the tile can be skipped entirely when doing the conditional render. I'll check in these changes later today. I think the main benefit for the by-region modes might have been saving the vertex processing for the second GPU, but it's nice that these modes seem useful for other cases as well. (Remember for split-frame SLI, there will be two hardware occlusion query results, one for each gpu, and by-region modes will make it possible to run the rendering commands only on one when using conditional render). Roland
Re: [Mesa3d-dev] [PATCH] [RFC] Remove PIPE_TEX_FILTER_ANISO to properly implement GL_EXT_texture_filter_anisotropic
On 01.01.2010 23:32, Luca Barbieri wrote: Currently Gallium defines a specific filtering mode for anisotropic filtering. This however prevents proper implementation of GL_EXT_texture_filter_anisotropic. The spec (written by nVidia) contains the following text: A texture's maximum degree of anisotropy is specified independent from the texture's minification and magnification filter (as opposed to being supported as an entirely new filtering mode). Implementations are free to use the specified minification and magnification filter to select a particular anisotropic texture filtering scheme. For example, a NEAREST filter with a maximum degree of anisotropy of two could be treated as a 2-tap filter that accounts for the direction of anisotropy. Implementations are also permitted to ignore the minification or magnification filter and implement the highest quality of anisotropic filtering possible. and Should there be a particular anisotropic texture filtering minification and magnification mode? RESOLUTION: NO. The maximum degree of anisotropy should control when anisotropic texturing is used. Making this orthogonal to the minification and magnification filtering modes allows these settings to influence the anisotropic scheme used. Yes, such an anisotropic filtering scheme exists in hardware. Gallium does the opposite, and this prevents use of nearest anisotropic filtering which is supported in nVidia hardware and also introduces redundant state. This patch removes PIPE_TEX_FILTER_ANISO. Anisotropic filtering is enabled if and only if max_anisotropy > 1.0. Values between 0.0 and 1.0, inclusive, of max_anisotropy are to be considered equivalent, all meaning that anisotropic filtering is off. This approach has the small drawback of eliminating the possibility of enabling anisotropic filtering on either minification or magnification separately, which Radeon hardware seems to support and which is currently supported by Gallium but not exposed to OpenGL.
If this is actually useful it could be handled by splitting max_anisotropy in two values and adding an appropriate OpenGL extension. How does Radeon anisotropic magnification differ from linear magnification? Note that different 3d apis have different requirements - ideally we should be able to choose some state which suits all of them. In particular, d3d10/11 have a separate filter mode for aniso (which applies to all of min/mag/mip filters at the same time). d3d9 also has a special aniso filter, but it can be set separately for min and mag - apart from aniso d3d9 also has some more filters like 4-sample tent/gaussian, all of them with undefined results if used as mip filter. max aniso values with d3d can be from (uint) 1 to 16, and I haven't seen hardware yet which could use float values for that. So it seems for a conformant d3d9 (but not d3d10) implementation you'll need to be able to enable aniso for min/mag separately. Meanwhile, you said This however prevents proper implementation of GL_EXT_texture_filter_anisotropic. This isn't quite true - you've quoted it yourself: Implementations are also permitted to ignore the minification or magnification filter and implement the highest quality of anisotropic filtering possible. I don't think it's terribly useful to be able to enable anisotropic filtering with other min/mag filters, and d3d never allowed it, hence hardware support for this will likely be rare. I don't really have a strong opinion though if we should allow this in the api or not, I guess it might make some drivers (except nvidia ones) (plus d3d state trackers...) a bit more complicated but it shouldn't be too bad, and maybe there's actually one app out there which would use it - or maybe it'll give better results for things like forced aniso.
Roland
Re: [Mesa3d-dev] Yet more r300g fear and loathing...
On 21.12.2009 15:13, Henri Verbeet wrote: 2009/12/21 Corbin Simpson mostawesomed...@gmail.com: So, yet another thing that r300 sucks balls at: NPOT textures. We've been talking it over on IRC, and here are the options. 1) Don't do NPOT. Stop advertising PIPE_CAP_NPOT, refuse to accept NPOT dimensions on textures. This sucks because it means that we don't get GL 2.0, which means most apps (bless their non-compliant souls) will refuse to attempt GLSL, which means that there's really no point in continuing this driver. 2) Don't do NPOT in the pipe, but do it in the state tracker instead, as needed. Write up the appropriate fallbacks, and then let ARB_npot be advertised by the state tracker regardless of whether PIPE_CAP_NPOT is set. Lots of typing, though. Lots and lots of typing. 3) Same as above, but put all the fallbacks in the pipe instead of the state tracker. I am *really* not fond of this, since PIPE_CAP was not intended for lies, but it was mentioned in IRC, so I gotta mention it here. 4) The fglrx special: Don't require ARB_npot for advertising GL 2.0. I figured this wasn't on the table, but you never know... This is not really about where to implement the fallbacks, but as far as Wine is concerned, we'd mostly care about not triggering those if we can avoid them, e.g. by restrictions on clamping and filtering. We don't care much if GL 2.0 is supported or not, so a hypothetical MESA_conditional_npot extension would work for us. Other applications might care though, in which case an extension that allows us to query what situations trigger fallbacks would work for us as well. The fglrx solution mostly just sucks, for an important part because there's (afaik) no real documentation on what the restrictions are, and the reported extensions are now inconsistent with the reported GL version. That said, Wine has code to handle this case now, and I imagine other applications do as well.
This is a very common hardware problem, there's lots of hardware out there which can do some (like r300) or even all glsl shaders but lack true npot support. I suspect there might be a few apps which try to see if ARB_texture_npot is supported, and if not, they'll assume that functionality isn't supported even if the driver says GL 2.0. There's certainly precedent for not announcing extensions even if you have to support them for a given gl version, one prominent example would be the nvidia GF3/4 cards which were GL 1.4 but couldn't do blend_func_separate - they didn't announce support for EXT_blend_func_separate and just used software fallback when needed. So of course just not announcing support for it isn't sufficient; you still need to implement this somehow (unless you just want to misrender...) but it might give apps a hint, even though the API wasn't really designed for this. Sounds like it'll just pollute things though. Last time I checked, the extension mechanism in gallium when used with the dri state tracker was broken though and needed some work anyway (because dri_init_extensions was called after st_create_context, and the former just enables lots of extensions regardless of any cap bits, hence the extension string will have lots of extensions which might not be supported). Anyway, doing this in a utility module sounds good, though I'm not sure what exactly you want to do. You could certainly fix up all texture lookups in the shader by doing address calculations manually and so on, but that gets a bit complicated quite soon I guess (in the case of r300 it probably also greatly increases the chances a shader won't fit into hardware). Maybe misrendering things would still be an option, I think it would mostly be clamp modes which wouldn't work correctly, since AFAIK you could make even mipmaps work (on r300 at least).
Roland
Re: [Mesa3d-dev] Yet more r300g fear and loathing...
The draw module approach can only work if the texcoords are used directly for texture lookups, not for calculated coords (it should be possible to detect these cases though). Roland On 21.12.2009 19:32, Keith Whitwell wrote: Faking those wrap modes is something that could be done either in the draw module (by decomposing triangles and adjusting the texcoords) or in the pixel shader (by adding logic to adjust the texcoord on a per-pixel basis). Probably the draw-module approach is the easiest to implement and is appropriate for an infrequently used path - you still get hardware rasterization speeds, just a more expensive vertex path. Keith From: Alex Deucher [alexdeuc...@gmail.com] Sent: Monday, December 21, 2009 10:18 AM To: tom fogal Cc: Mesa3D-Development Subject: Re: [Mesa3d-dev] Yet more r300g fear and loathing... I work on real-time visualization apps; the one in particular I'm thinking of does texture sampling of potentially-NPOT textures via GLSL. If sampling a NPOT texture is not going to run in hardware, the app is useless. Further, our app keeps track of the amount of GL memory allocated for textures, FBOs and the like. If a texture is going to be silently extended, that messes with our management routines [1]. The hardware supports rectangular texture sampling. What's missing is support for certain wrap modes and mipmaps with npot textures. Neither of which are used that often. 
[Mesa3d-dev] gallium-edgeflags branch
Hello, I plan to merge the gallium-edgeflags branch soon. I should have fixed up drivers syntactically, but note some will break if applications use edgeflags. In particular the drivers which so far have chosen to ignore edgeflags completely and haven't implemented a fallback to the draw module might break (I'm looking at you, r300 and nv30!...). If those drivers want to continue to just have broken edgeflags support but you just don't want them to crash, you'll need to fix them up so they map the edgeflag output of the vertex shader to something halfway meaningful for the hw, like an unneeded temp or so. But really the right solution is to fix them so they use the draw module for things they can't handle, like svga and nv40 do (or, of course, make them handle edgeflags properly in hardware, but that might be dx10-class hardware only which truly can do it). Drivers for hardware without a hw vertex unit shouldn't have any problem, since draw will handle everything for them. You can use progs/trivial/tri-edgeflag for instance to see what happens. Roland
Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions
On 15.12.2009 14:14, michal wrote: Guys, Does the attached patch make sense to you? I replaced the incomplete switch-cases with calls to u_format_access functions that are complete but are going to be a bit more expensive to call. Since they are not used very often in the mesa state tracker, I thought it's a good compromise. They are not only used in state trackers, but in drivers as well. That said, it's probably not really a performance critical path. Though I'm not sure it makes sense to even keep these functions around if they'll just do a single function call. Also, I'm pretty sure your usage of the union isn't strict aliasing compliant (as far as I can tell you could just go back and remove that ugly union again), though it's probably one of the cases gcc won't complain about (and hopefully won't miscompile).
Re: [Mesa3d-dev] [PATCH] Fix u_pack_color.h rgb pack/unpack functions
On 15.12.2009 18:02, michal wrote: Roland Scheidegger wrote: On 15.12.2009 14:14, michal wrote: Guys, Does the attached patch make sense to you? I replaced the incomplete switch-cases with calls to u_format_access functions that are complete but are going to be a bit more expensive to call. Since they are not used very often in the mesa state tracker, I thought it's a good compromise. They are not only used in state trackers, but in drivers as well. That said, it's probably not really a performance critical path. Though I'm not sure it makes sense to even keep these functions around if they'll just do a single function call. Also, I'm pretty sure your usage of the union isn't strict aliasing compliant (as far as I can tell you could just go back and remove that ugly union again), though it's probably one of the cases gcc won't complain about (and hopefully won't miscompile). I am casting to (void *) and then u_format casts it back to whatever it needs to. I think I am innocent. Casts to void * and back to something are only safe if the something is the same as it initially was. Well in theory anyway. That's also where some of the initial warnings came from, callers using some pointer to unsigned, which then in the end got cast to ubyte * or whatever. An intermediate cast to void * doesn't change anything. That said, the callers probably couldn't have handled the formats not returning the right type anyway. Often though gcc won't complain about aliasing if you use some void * pointer in a function call and cast it to something other than what it was, I think it usually won't be able to figure out what the original type was, hence it needs to assume it can alias with anything. Anyway, I will go with Keith's suggestion and fill in only the switch-default case. We can always nuke the special cases later when/if we realise the performance impact is negligible. Yes, sounds good.
Roland
Re: [Mesa3d-dev] r300 driver help needed
On 14.12.2009 10:29, michael wang wrote: Dear Mesa developers, I am learning OpenGL on my notebook (with an old ATI Radeon X600 video card), but I cannot get GL_LINE_STIPPLE to work. It draws solid lines only. glxinfo shows I'm using the R300 driver, and from some study of the source code I find it falls back (to software rendering I suppose) when I enable GL_LINE_STIPPLE. So my question is: 1. How can I check why my software rendering does not do line stipple? You could try it with software mesa and see if it works there. Also, the fallback for r300 only happens if you don't have disable_lowimpact_fallbacks set, so if this is set for whatever reason you will indeed get a solid line. If you set RADEON_DEBUG=fall it should print out a warning if it hits that line stipple fallback. 2. Is the R300 project still active? If so, where should I report this bug to? The project is still alive; if it's a driver bug and not your app you could file a bug at bugs.freedesktop.org. Roland
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
On 09.12.2009 18:58, michal wrote: Keith Whitwell wrote: On Wed, 2009-12-09 at 09:16 -0800, michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Thanks. Michal, Can you detail what testing you've done on this branch and which environments you have/haven't built on? Testing: * Captured the output of the old syntax parser and compared with the output of the new parser. No regressions found. Used a set of over 400 shaders to perform the comparison. * Ran GLSL Parser Test to see if the new parser successfully integrates with the rest of Mesa. No regressions found. So far I have been building that with scons on windows. I am planning to fix the build with make and scons on linux. Seems to compile just fine now with make. Too bad all the strict-aliasing violations are still there (in grammar.c), I'll give this a look (but don't wait for it for merging). Also, there seems to be some char/byte uncleanliness, I get a gazillion warnings like:

shader/grammar/grammar.c: In function ‘get_spec’:
shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
/usr/include/string.h:397: note: expected ‘const char *’ but argument is of type ‘byte *’
shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
shader/grammar/grammar.c:1978: note: expected ‘const char *’ but argument is of type ‘byte *’
shader/grammar/grammar.c:1978: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness

Roland
Re: [Mesa3d-dev] glsl-pp-rework-2 branch merge
On 09.12.2009 18:16, michal wrote: Hi all, I would like to merge this branch back to master this week. If anyone could test if the build works on his/her system, it would be nice. Good stuff! Looks like only the scons build system is working though. Roland
Re: [Mesa3d-dev] Branch pipe-format-simplify open for review
On 08.12.2009 15:55, michal wrote: This branch simplifies pipe/p_format.h by making enum pipe_format what it should have been -- an enum. As a result there is no extra information encoded in it and one needs to use auxiliary/util/u_format.h to get that info instead. Linking to the auxiliary/util lib is necessary. Please review, and if you can test that it doesn't break your setup, I will appreciate it. I would like to hear from the r300 and nouveau guys, as those drivers were using some internal macros and I wasn't 100% sure I got the conversion right. Looks nice, though it is unfortunately based on pre gallium-noblocks merge, so I suspect you'll get a conflict for almost every patch chunk at least in drivers if you try to merge it... Roland
Re: [Mesa3d-dev] Branch pipe-format-simplify open for review
On 08.12.2009 16:49, michal wrote: Roland Scheidegger pisze: On 08.12.2009 15:55, michal wrote: This branch simplifies pipe/p_format.h by making enum pipe_format what it should have been -- an enum. As a result there is no extra information encoded in it and one needs to use auxiliary/util/u_format.h to get that info instead. Linking to the auxiliary/util lib is necessary. Please review, and if you can test that it doesn't break your setup, I will appreciate it. I would like to hear from the r300 and nouveau guys, as those drivers were using some internal macros and I wasn't 100% sure I got the conversion right. Looks nice, though it is unfortunately based on pre gallium-noblocks merge, so I suspect you'll get a conflict for almost every patch chunk at least in drivers if you try to merge it... I didn't touch pipe blocks -- I left pf_get_block* and friends in pipe_format.h intact. Yes, but you're bound to get lots of conflicts because you replaced for instance pf_format_get_block with util_format_get_block, whereas that stuff is removed from master because pipe_format_block (and the block/nblocksx/nblocksy variables in pipe_texture and pipe_transfer) are gone completely. I quickly tried a merge and there were conflicts in over 40 files - from a quick glance though they should be trivial to resolve. And I don't think there's too much hidden stuff which won't work any longer - just let util_format_get_block die and it should probably work out ok. How severe is the gallium-noblocks change? I would like to avoid merging master into this branch. It's not really that severe, it just touched a lot of the same places in drivers this change does. btw, I also avoided merging master to feature branch when I merged gallium-noblocks, and instead fixed up conflicts on merge to master (and adapted stuff which needed changes later). Is there some policy for this? Roland
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like

union a_union {
   int i;
   double d;
};

int f() {
   double d = 3.0;
   return ((union a_union *) &d)->i;
}

but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) - this site told me this is ok ("casting through a union (2)"): http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, I guess it was considered ok in 2006 but not now (though I'm not sure why not)... I did that in some places because otherwise there's no way around assigning the value to the union and passing that around instead. Curiously though, despite the gcc manpage saying the code might not be ok, gcc doesn't warn about it in the places I used it. Anyway, I'm not sure it's worth bothering with this now, as drivers could be fixed up without any interface changes. Roland

On 08.12.2009 17:19, Keith Whitwell wrote: Roland, This looks OK to me, hopefully this will see us getting on top of strict aliasing issues after all these years... Keith On Mon, 2009-12-07 at 18:14 -0800, Roland Scheidegger wrote: Hello, I'm planning to merge the gallium-strict-aliasing branch soon, which will bring another gallium api change. The pipe_reference function has different arguments, because the old version was pretty much not really useful for strict-aliasing compliant code (the util_color_pack functions also get an update for the same reason). The goal of course is to enable builds which no longer need -fno-strict-aliasing. scons builds already didn't do this (which was a bug since the builds were indeed broken). I didn't check all drivers for strict-aliasing compliance, but for gallium everybody should make sure the code they are submitting is according to strict aliasing rules (*).
One downside of compiling with -fno-strict-aliasing is also that you don't get the warnings wrt strict aliasing, so you might have missed that in the past. (There are no build system changes yet; there's still some strict-aliasing violating code in shader/grammar which should get replaced soon anyway.) (*) Strictly speaking, it looks like c99 actually has undefined behaviour writing and reading different members of a union (wtf?), but this is considered acceptable here, and all compilers should support it. Roland
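For reference, the punning variants under discussion can be put side by side. This is a minimal sketch with made-up names, not code from the branch; the cast-through-union form is the one the gcc manpage flags as undefined, while the store-then-read form and memcpy are the portable alternatives:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union a_union {
   uint32_t i;
   float f;
};

/* Cast-through-union: the pattern from the example above. gcc documents
 * this as undefined because the object accessed is a float, not a union,
 * so it may break under -fstrict-aliasing. */
static uint32_t bits_via_cast(float f)
{
   return ((union a_union *) &f)->i;
}

/* Sanctioned form: store into one union member, read another. This is
 * what "assigning the value to the union and passing that around" means. */
static uint32_t bits_via_union(float f)
{
   union a_union u;
   u.f = f;
   return u.i;
}

/* memcpy is always safe; compilers turn a fixed-size copy like this into
 * a plain register move, so there is no performance cost. */
static uint32_t bits_via_memcpy(float f)
{
   uint32_t i;
   memcpy(&i, &f, sizeof i);
   return i;
}
```

All three return the same bits on a typical build; the difference is only in which ones the standard (and gcc's optimizer) guarantees.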
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
On 08.12.2009 17:37, Keith Whitwell wrote: On Tue, 2009-12-08 at 08:31 -0800, Roland Scheidegger wrote: Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like union a_union { int i; double d; }; int f() { double d = 3.0; return ((union a_union *) &d)->i; } but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) - this site told me this is ok ("casting through a union (2)"): http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, I guess it was considered ok in 2006 but not now (though I'm not sure why not)... I did that in some places because otherwise there's no way around assigning the value to the union and passing that around instead. Curiously though, despite the gcc manpage saying the code might not be ok, gcc doesn't warn about it in the places I used it. Anyway, I'm not sure it's worth bothering with this now, as drivers could be fixed up without any interface changes. Is it a lot of extra work to fix? I wouldn't mind getting on top of this once and for all. Not in the places I touched. It'll just make the code uglier, though at least the compiler might still optimize the extra assignments away. For example in st_atom_pixeltransfer.c it now looks like this:

util_pack_color_ub(r, g, b, a, pt->format, (union util_color *)(dest + k));

and I'd need to change it to:

union util_color uc;
util_pack_color_ub(r, g, b, a, pt->format, &uc);
*(dest + k) = uc.ui;

Ok, not really a lot more ugly. Will do this then, though there are other places where things like that might already be used, and since the compiler does not issue any warnings it might be a bit time consuming to find all of them. Roland
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
On 08.12.2009 18:12, Roland Scheidegger wrote: On 08.12.2009 17:37, Keith Whitwell wrote: On Tue, 2009-12-08 at 08:31 -0800, Roland Scheidegger wrote: Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like union a_union { int i; double d; }; int f() { double d = 3.0; return ((union a_union *) &d)->i; } but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) - this site told me this is ok ("casting through a union (2)"): http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, I guess it was considered ok in 2006 but not now (though I'm not sure why not)... I did that in some places because otherwise there's no way around assigning the value to the union and passing that around instead. Curiously though, despite the gcc manpage saying the code might not be ok, gcc doesn't warn about it in the places I used it. Anyway, I'm not sure it's worth bothering with this now, as drivers could be fixed up without any interface changes. Is it a lot of extra work to fix? I wouldn't mind getting on top of this once and for all. Not in the places I touched. It'll just make the code uglier, though at least the compiler might still optimize the extra assignments away. For example in st_atom_pixeltransfer.c it now looks like this:

util_pack_color_ub(r, g, b, a, pt->format, (union util_color *)(dest + k));

and I'd need to change it to:

union util_color uc;
util_pack_color_ub(r, g, b, a, pt->format, &uc);
*(dest + k) = uc.ui;

Ok, not really a lot more ugly. Will do this then, though there are other places where things like that might already be used, and since the compiler does not issue any warnings it might be a bit time consuming to find all of them. Ok, unfortunately the code in vg_translate.c got a lot more verbose :-(. Also, I think there's quite some usage of casting void * to other types.
That could also lead to strict-aliasing violations, as you're only allowed to do casts back to the original type it had (hence the strict-aliasing warnings if you do *(float *) (void *) &some_uint_value, because the compiler is able to determine the original type). Might be safe though as long as gcc doesn't do too much interprocedural optimization, and if it does it should probably be able to at least output a warning, since in this case it should also be able to determine the original type I guess... Roland
Re: [Mesa3d-dev] gallium-strict-aliasing branch merge
On 08.12.2009 20:57, Martin Olsson wrote: Roland Scheidegger wrote: Keith, I think there might be some slight issue with some of the changes in the drivers I did. In particular, I was under the impression it would be ok to do something like union a_union { int i; double d; }; int f() { double d = 3.0; return ((union a_union *) &d)->i; } but in fact the gcc manpage tells me it's not (the example is from the gcc 4.4 manpage) I think the issue you are describing is explained here: http://patrakov.blogspot.com/2009/03/dont-use-old-dtoac.html Yes, probably. Note though it says gcc generates warnings for it, which didn't happen, so I think gcc would actually not miscompile it. (I suspect gcc doesn't complain and does not miscompile as long as it can't determine the original type of the value.) Still, the explanation is imho not really satisfactory. I think a lot of people used to think this would be perfectly fine (see for instance http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html, "casting through a union (2)"). Also note the link he posts to the GCC manual: http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Optimize-Options.html#index-fstrict_002daliasing-721 Yep, that's the same stuff I used for the example. Roland
[Mesa3d-dev] gallium-strict-aliasing branch merge
Hello, I'm planning to merge the gallium-strict-aliasing branch soon, which will bring another gallium api change. The pipe_reference function has different arguments, because the old version was pretty much not really useful for strict-aliasing compliant code (the util_color_pack functions also get an update for the same reason). The goal of course is to enable builds which no longer need -fno-strict-aliasing. scons builds already didn't do this (which was a bug since the builds were indeed broken). I didn't check all drivers for strict-aliasing compliance, but for gallium everybody should make sure the code they are submitting is according to strict aliasing rules (*). One downside of compiling with -fno-strict-aliasing is also that you don't get the warnings wrt strict aliasing, so you might have missed that in the past. (There are no build system changes yet; there's still some strict-aliasing violating code in shader/grammar which should get replaced soon anyway.) (*) Strictly speaking, it looks like c99 actually has undefined behaviour writing and reading different members of a union (wtf?), but this is considered acceptable here, and all compilers should support it. Roland
Re: [Mesa3d-dev] [RFC] Move _mesa_memcpy to imports.h and inline it
On 04.12.2009 11:24, Kenneth Graunke wrote: On Thursday 03 December 2009 12:47:36 Brian Paul wrote: [snip] I've been meaning to go over imports.[ch] and make a bunch of the wrapper functions inlines. A lot of the wrappers aren't needed any more. Back before valgrind I used the memory-related wrappers quite often. For now, let's keep the wrappers so we don't have to touch tons of other files right away. Matt, feel free to submit a patch. -Brian I've attached patches to remove a number of the wrappers, should you decide you want to go that way.

diff --git a/src/mesa/main/imports.c b/src/mesa/main/imports.c
index 6a34aec..0f10111 100644
--- a/src/mesa/main/imports.c
+++ b/src/mesa/main/imports.c
@@ -268,17 +268,6 @@ _mesa_bzero( void *dst, size_t n )
 #endif
 }
-/** Wrapper around memcmp() */
-int
-_mesa_memcmp( const void *s1, const void *s2, size_t n )
-{
-#if defined(SUNOS4)
-   return memcmp( (char *) s1, (char *) s2, (int) n );
-#else
-   return memcmp(s1, s2, n);
-#endif
-}
-
 /*@}*/

So is the different implementation on SUNOS4 no longer relevant? Roland -- Join us December 9, 2009 for the Red Hat Virtual Experience, a free event focused on virtualization and cloud computing. Attend in-depth sessions from your desk. Your couch. Anywhere. http://p.sf.net/sfu/redhat-sfdev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 03.12.2009 01:38, Jose Fonseca wrote: Interesting. Yes we want to fix the problem, as we're missing out on potential optimizations. For fixing reference counting, couldn't we fix it by doing the final *pdst = src; in each pipe_xxx_reference function in the bottom of p_state.h, and pass only &(*pdst)->reference, &src->reference to p_refcnt.h's pipe_reference() instead (i.e., just pointers, and no pointers to pointers)? I haven't tested it but it seems that that would eliminate all casts, hence should be correct. That is pretty much what I did (except I used a temporary pointer to pass its address to pipe_reference so I didn't have to change the pipe_reference function at all, so it still worked save the aliasing issues for other callers). Some callers use this directly; I can fix them I guess, it's just a slightly less convenient function, if this is the approach we'll take. Roland Jose From: Roland Scheidegger [srol...@vmware.com] Sent: Wednesday, December 02, 2009 23:19 To: Jose Fonseca Cc: mesa3d-...@lists.sf.net Subject: Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Do we ever want to fix strict aliasing? If we do, I think the problem with refcounting is pretty fundamental (I traced the crash to aliasing problems there, and hacked up some bogus version which didn't segfault for the testcase I used). At least I can't see a way to make this really work in some nice way. Supposedly gcc supports __attribute__((__may_alias__)) but I had no luck with it. In gallium (not core mesa) there's only one other offender causing a large amount of warnings, that is util_pack_color, and I think it won't actually cause problems.
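José's proposal can be sketched roughly as below. The names mirror gallium's pipe_reference/p_refcnt.h conventions, but the bodies are simplified mock-ups for illustration (no atomics, no debug tracking), not the actual branch code. The key point is that only `struct pipe_reference *` pointers cross the helper boundary, so no pointer-to-pointer cast is needed and no aliasing rule is violated:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pipe_reference {
   int count;
};

/* Generic helper: takes plain reference pointers, never pointers to
 * pointers. Returns true when the old object's count dropped to zero
 * and the caller must destroy it. */
static bool pipe_reference(struct pipe_reference *dst,
                           struct pipe_reference *src)
{
   bool destroy = false;
   if (dst != src) {
      if (src)
         src->count++;
      if (dst && --dst->count == 0)
         destroy = true;
   }
   return destroy;
}

struct pipe_buffer {
   struct pipe_reference reference;
   /* ... actual resource state ... */
};

/* Typed wrapper, as in p_state.h: the final *pdst = src happens here on
 * a pipe_buffer** directly, so there is no cast between incompatible
 * pointer types anywhere. */
static void pipe_buffer_reference(struct pipe_buffer **pdst,
                                  struct pipe_buffer *src)
{
   if (pipe_reference(*pdst ? &(*pdst)->reference : NULL,
                      src ? &src->reference : NULL)) {
      /* destroy(*pdst) would go here */
   }
   *pdst = src;
}
```

The old interface took a `void **` (or cast `struct pipe_buffer **` to `struct pipe_reference **`), which is exactly the aliasing pattern that produced the p_refcount.h warnings.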
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 03.12.2009 11:17, Keith Whitwell wrote: On Wed, 2009-12-02 at 12:46 -0800, Roland Scheidegger wrote: On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Hmm, actually some of them (in mesa at least) seem to be really unnecessary. Take the COPY_4FV macro for instance. I replaced that in a simple test program (attached) to either just do direct assignment without cast, or use memcpy instead. That comment was probably true in 1999 -- but possibly not any longer... The results are actually interesting, the comment says the cast is done to avoid going through fp registers, but looking at the assembly (at least with optimization) that doesn't happen anyway, and the generated code is actually nearly identical, but in fact it not only triggers strict-aliasing warnings but doesn't work correctly (when compiled with -O3 or similar parameters invoking -fstrict-aliasing). ... Doesn't use 128bit sse moves but looks like an improvement... When using no optimization the code certainly gets much less readable and the memcpy version will call glibc memcpy (which itself will still be optimized hence probably faster despite the function call). So I'll kill at least this one and just use _mesa_memcpy there, unless there are good reasons not to. I think pretty much all compilers should have builtin memcpy optimizations. I didn't realize COPY_4FV and friends were related to our strict aliasing problems -- if that's the case, let's kill or reimplement them straight away. Actually, that was the simplest one, and most of the other macros don't do this. There's also plenty of warnings in the shader/grammar code; apart from that there's actually not that many warnings, at least when not compiling legacy drivers...
So I guess getting rid of strict-aliasing issues is doable for gallium. Roland
Re: [Mesa3d-dev] gallium-noblocks branch merge
On 03.12.2009 20:55, Christoph Bumiller wrote: Roland Scheidegger schrieb: Hi, I'm planning to merge the gallium-noblocks branch to master soon. This api change may affect your driver, statetracker, whatever. I _should_ have fixed up all in-tree stuff using it, but that's not a guarantee it will still run correctly (the nv50 driver was strange for instance), and What's strange with nv50? There's this one if (!pt->nblocksx[level]) { in nv50_transfer.c that was an unnecessary leftover because I hadn't seen that miptree_blanket forgot to initialize these and pushed a bit too early; thankfully this is now gone automatically. Ok, this is mostly what was strange. That and it was the driver which by far needed the most changes :-). I just need the y blocks everywhere instead of just y because things like offset = stride * y are simply wrong if you have *actual* multi-pixel blocks (pitch as in nblocksx * width). Yes, drivers are encouraged to use the block helpers. This way they don't need to special case any formats, as it should work for uncompressed, dxt, or things like ycbcr just the same. I hope no one will try to transfer just parts of a block (makes not much sense for DXT imo though). Yes, this shouldn't happen. Neither ogl nor dx should trigger this (it's not allowed for CompressedTexSubImage), so transfers are required to only happen along block boundaries. Roland
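The point about offset = stride * y being wrong for multi-pixel blocks can be made concrete. This is a hypothetical sketch in the spirit of the u_format block helpers (the struct and function names are illustrative, not the actual gallium API):

```c
#include <assert.h>

/* Hypothetical per-format block description: a DXT1 block, for example,
 * covers 4x4 pixels in 8 bytes; an uncompressed RGBA8 "block" is 1x1x4. */
struct block_desc {
   unsigned width;   /* pixels per block in x */
   unsigned height;  /* pixels per block in y */
   unsigned bytes;   /* bytes per block */
};

/* Number of blocks needed to cover a given width/height, rounding up. */
static unsigned nblocksx(const struct block_desc *b, unsigned width)
{
   return (width + b->width - 1) / b->width;
}

static unsigned nblocksy(const struct block_desc *b, unsigned height)
{
   return (height + b->height - 1) / b->height;
}

/* Byte offset of the block containing pixel (x, y). stride is bytes per
 * row of *blocks* (nblocksx * bytes). offset = stride * y would step in
 * pixel rows and overshoot by a factor of block height. */
static unsigned block_offset(const struct block_desc *b, unsigned stride,
                             unsigned x, unsigned y)
{
   return (y / b->height) * stride + (x / b->width) * b->bytes;
}
```

For a 64-pixel-wide DXT1 surface the stride is 16 blocks * 8 bytes = 128 bytes, and pixel (8, 4) lands in block (2, 1) at offset 128 + 16 = 144; the same code handles uncompressed formats with a 1x1 block, which is why drivers using the helpers need no special cases.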
Re: [Mesa3d-dev] [RFC] Move _mesa_memcpy to imports.h and inline it
On 03.12.2009 19:46, Matt Turner wrote: Most of the functions in imports.c are very small, so the function call overhead is large relative to their size. Can't we do something like in the attached patch and move them to imports.h and mark them static inline? Things like memcpy and memset are often optimized out by the compiler, and by hiding them in these wrapper functions, we're probably losing any benefits we'd otherwise see. ++ from me, at least for the very simple wrappers. _mesa_memcpy especially I think can be very nicely used for array assignments and the like, and in case of (very) small amounts of data to copy call overhead might be significant. Similarly, if we're going to use a magic sqrtf algorithm (apparently for speed) then shouldn't we also let the compiler properly inline the function? Not sure here, the function is still quite complex, I don't think call overhead will make any difference. I've looked at the code though when it wasn't using the fast path (with -O3 but DEBUG - why is this different?) This version though adds a lot of overhead: call overhead for _mesa_sqrtf, overhead converting to double, overhead converting back. In the generated code the actual sqrtf code was a single assembly instruction (sqrtsd %xmm0, %xmm0) - granted that's SSE2 only, and it requires quite a few cycles. Still, I guess the overhead is significant, not to mention that if we'd just use a float instead of double not only would we not have to convert the type but the compiler would actually issue sqrtss %xmm0, %xmm0 instead, which is (depending on the cpu) twice as fast. Not sure why we use double there, are there platforms where sqrtf(float x) isn't supported? So really, call overhead is a tiny fraction of the optimization potential for this function. When not using DEBUG (and USE_IEEE is defined) the function is still quite a few cycles, so call overhead doesn't look that bad either.
I don't actually know which version is faster (or more accurate - I think though sqrtss is actually fully accurate). Of course using sqrtf(x) will only be fast if the cpu supports some kind of fast float unit (and the compiler knows how to use it). If you'd want to do some more optimization, there's for instance _mesa_inv_sqrtf - it is supposedly fast, but sse2 offers rsqrtss, which is really fast. However, I remember we got some bugs some time ago when gcc actually used that, because precision wasn't enough - it will do this if you enable -funsafe-math-optimizations, -mrecip or similar. I've just seen though that at least gcc 4.4 does an additional newton-raphson step when you do 1.0f/sqrtf(x) (so it will issue rsqrtss plus a couple muls and adds), which might still be less or even more accurate, and almost certainly be faster than the manual version. So there's probably far more optimization potential than the call overhead. Most of those functions are probably never used in any performance critical path anyway. I also don't quite understand wrapper functions like double _mesa_pow(double x, double y) { return pow(x, y); } Maybe at one time these had #ifdefs in them like _mesa_memcpy, but I can't see any reason not to remove it now. Someone enlighten me. I guess there might have been indeed #ifdefs in the past. In any case, using a wrapper would make it easier to implement such optimizations in the future if anyone wants to, not that this is something which you probably want to do (that stuff is probably better left up to the compiler). So, at least if they are inlined, they shouldn't really hurt either.
[Mesa3d-dev] mesa/gallium strict aliasing bugs
Hi, I've come across some bug (which I thought might be related to the gallium-noblocks branch, but it's not) which caused a segfault, but only when not using debug builds. I think this is the same issue Vinson was seeing some time ago. Looks like an impossible backtrace:

#0 st_texture_image_copy (pipe=0x612640, dst=0x0, dstLevel=<value optimized out>, src=0x6e1dd0, face=0) at src/mesa/state_tracker/st_texture.c:306
#1 0x7759b383 in copy_image_data_to_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1673
#2 st_finalize_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1807
#3 0x7758fd9d in finalize_textures (st=0x68a9c0) at src/mesa/state_tracker/st_atom_texture.c:144

Segfault seems to be because dst is 0x0, but if you look at the call stack it is easy to see this is impossible. That would point to a gcc optimizer issue (using gcc 4.4.1), except there are quite a few warnings during compile, especially about violating strict-aliasing rules... So, in the gallium.py scons file there's actually this:

if debug:
    ccflags += ['-O0', '-g3']
elif env['CCVERSION'].startswith('4.2.'):
    # gcc 4.2.x optimizer is broken
    print 'warning: gcc 4.2.x optimizer is broken -- disabling optimizations'
    ccflags += ['-O0', '-g3']
else:
    ccflags += ['-O3', '-g3']

So I added -fno-strict-aliasing and indeed, the segfault is gone. Hence I believe this is incorrectly accusing the gcc 4.2 optimizer, whereas it's actually a code bug, and certainly it is not restricted to gcc 4.2 (unless this addressed a different problem). Not quite sure though why the code violates strict-aliasing rules in all those places - about half of the warnings are from pipe_reference (p_refcount.h:85).
Not sure if all warnings are actually real issues, and not sure how this should be fixed (should we try to fix this for real or just force -fno-strict-aliasing). Roland
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 02.12.2009 18:33, José Fonseca wrote: On Wed, 2009-12-02 at 09:05 -0800, Roland Scheidegger wrote: Hi, I've come across some bug (which I thought might be related to the gallium-noblocks branch, but it's not) which caused a segfault, but only when not using debug builds. I think this is the same issue Vinson was seeing some time ago. Looks like an impossible backtrace:

#0 st_texture_image_copy (pipe=0x612640, dst=0x0, dstLevel=<value optimized out>, src=0x6e1dd0, face=0) at src/mesa/state_tracker/st_texture.c:306
#1 0x7759b383 in copy_image_data_to_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1673
#2 st_finalize_texture (ctx=<value optimized out>, pipe=<value optimized out>, tObj=0x6919d0, needFlush=<value optimized out>) at src/mesa/state_tracker/st_cb_texture.c:1807
#3 0x7758fd9d in finalize_textures (st=0x68a9c0) at src/mesa/state_tracker/st_atom_texture.c:144

Segfault seems to be because dst is 0x0, but if you look at the call stack it is easy to see this is impossible. That would point to a gcc optimizer issue (using gcc 4.4.1), except there are quite a few warnings during compile, especially about violating strict-aliasing rules... So, in the gallium.py scons file there's actually this:

if debug:
    ccflags += ['-O0', '-g3']
elif env['CCVERSION'].startswith('4.2.'):
    # gcc 4.2.x optimizer is broken
    print 'warning: gcc 4.2.x optimizer is broken -- disabling optimizations'
    ccflags += ['-O0', '-g3']
else:
    ccflags += ['-O3', '-g3']

So I added -fno-strict-aliasing and indeed, the segfault is gone. Hence I believe this is incorrectly accusing the gcc 4.2 optimizer, whereas it's actually a code bug, and certainly it is not restricted to gcc 4.2 (unless this addressed a different problem). It addressed a different problem. Type git show bb8f3090ba37aa3f24943fdb43c4120776289658 to see explanation of it. Ok.
Not quite sure though why the code violates strict-aliasing rules in all those places - about half of the warnings are from pipe_reference (p_refcount.h:85). Not sure if all warnings are actually real issues, and not sure how this should be fixed (should we try to fix this for real or just force -fno-strict-aliasing). I read (forgot where) that gcc strict aliasing warnings don't catch all cases. The gcc man page states this (-Wstrict-aliasing=n). Says though (gcc 4.4.1) with n=3 (default) there should be very few false positives and few false negatives. I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. ok. I guess though there's no guarantee it won't break other compilers where we haven't set any flags for this. Roland
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Hmm, actually some of them (in mesa at least) seem to be really unnecessary. Take the COPY_4FV macro for instance. I replaced that in a simple test program (attached) to either just do direct assignment without cast, or use memcpy instead. The results are actually interesting, the comment says the cast is done to avoid going through fp registers, but looking at the assembly (at least with optimization) that doesn't happen anyway, and the generated code is actually nearly identical, but in fact it not only triggers strict-aliasing warnings but doesn't work correctly (when compiled with -O3 or similar parameters invoking -fstrict-aliasing).

assign_cast:
.LFB45:
	.cfi_startproc
	movl	(%rsi), %edx
	leaq	4(%rdi), %rax
	movl	%edx, 4(%rdi)
	movl	4(%rsi), %edx
	movl	%edx, 4(%rax)
	movl	8(%rsi), %edx
	movl	%edx, 8(%rax)
	movl	12(%rsi), %edx
	movl	%edx, 12(%rax)
	ret
	.cfi_endproc

assign:
.LFB46:
	.cfi_startproc
	movl	(%rsi), %eax
	movl	%eax, 4(%rdi)
	movl	4(%rsi), %eax
	movl	%eax, 8(%rdi)
	movl	8(%rsi), %eax
	movl	%eax, 12(%rdi)
	movl	12(%rsi), %eax
	movl	%eax, 16(%rdi)
	ret
	.cfi_endproc

But clearly using memcpy the compiler does a better job:

assign_cpy:
.LFB44:
	.cfi_startproc
	movq	(%rsi), %rax
	movq	%rax, 4(%rdi)
	movq	8(%rsi), %rax
	movq	%rax, 12(%rdi)
	ret
	.cfi_endproc
.LFE44:

Doesn't use 128bit sse moves but looks like an improvement... When using no optimization the code certainly gets much less readable and the memcpy version will call glibc memcpy (which itself will still be optimized hence probably faster despite the function call). So I'll kill at least this one and just use _mesa_memcpy there, unless there are good reasons not to. I think pretty much all compilers should have builtin memcpy optimizations.
Roland

#include <string.h>
#include <stdio.h>

#define COPY_4FV( DST, SRC ) \
do { \
   const unsigned *_s = (const unsigned *) (SRC); \
   unsigned *_d = (unsigned *) (DST); \
   _d[0] = _s[0]; \
   _d[1] = _s[1]; \
   _d[2] = _s[2]; \
   _d[3] = _s[3]; \
} while (0)

#define COPY_4FV_NOCAST( DST, SRC ) \
do { \
   (DST)[0] = (SRC)[0]; \
   (DST)[1] = (SRC)[1]; \
   (DST)[2] = (SRC)[2]; \
   (DST)[3] = (SRC)[3]; \
} while (0)

#define COPY_4FV_MEMCPY( DST, SRC ) \
do { \
   memcpy(DST, SRC, sizeof(float) * 4); \
} while (0)

struct sfloat {
   unsigned unused;
   float p[4];
};

void assign_cpy(struct sfloat *s, float *param)
{
   COPY_4FV_MEMCPY(s->p, param);
}

void assign_cast(struct sfloat *s, float *param)
{
   COPY_4FV(s->p, param);
}

void assign(struct sfloat *s, float *param)
{
   COPY_4FV_NOCAST(s->p, param);
}

int main(void)
{
   float fl[4] = {0.1, 0.2, 0.3, 0.4};
   struct sfloat s1;
   struct sfloat s2;
   struct sfloat s3;
   assign(&s1, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n",
           s1.p[0], s1.p[1], s1.p[2], s1.p[3]);
   assign_cpy(&s2, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n",
           s2.p[0], s2.p[1], s2.p[2], s2.p[3]);
   assign_cast(&s3, fl);
   fprintf(stderr, "assigned values are %f %f %f %f\n",
           s3.p[0], s3.p[1], s3.p[2], s3.p[3]);
   return 0;
}

-- Join us December 9, 2009 for the Red Hat Virtual Experience, a free event focused on virtualization and cloud computing. Attend in-depth sessions from your desk. Your couch. Anywhere. http://p.sf.net/sfu/redhat-sfdev2dev ___ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
[Mesa3d-dev] gallium-noblocks branch merge
Hi, I'm planning to merge the gallium-noblocks branch to master soon. This API change may affect your driver, state tracker, whatever. I _should_ have fixed up all in-tree stuff using it, but that's no guarantee it will still run correctly (the nv50 driver was strange, for instance), and certainly if you have out-of-tree things they will break. The changes themselves should be fairly simple; you can read more about them in the git log. Roland
Re: [Mesa3d-dev] mesa/gallium strict aliasing bugs
On 02.12.2009 18:33, José Fonseca wrote: I've seen strict aliasing assumptions causing bugs in other gallium components. It seems endemic to our code. Unless we actively decide to go and chase the strict aliasing bugs now we should add -fno-strict-aliasing to all our builds. Do we ever want to fix strict aliasing? If we do, I think the problem with refcounting is pretty fundamental (I traced the crash to aliasing problems there, and hacked up some bogus version which didn't segfault for the testcase I used). At least I can't see a way to make this really work in some nice way. Supposedly gcc supports __attribute__((__may_alias__)) but I had no luck with it. In gallium (not core mesa) there's only one other offender causing a large number of warnings, namely util_pack_color, and I think it won't actually cause problems.
Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: Fix array out-of-bounds access by _mesa_TexGeni.
On 01.12.2009 11:16, Ian Romanick wrote: Speaking of which... there are a bunch of conflicts merging 7.7 to master in Galliumland. Could one of you guys take a look at it? I have no clue what's going on over there. Quite a few of those were due to the gallium interface changes (introduced by the width0 branch merge). It will only get worse when I merge the gallium-noblocks branch (not quite there yet). Those changes are fairly intrusive, as the API changes affect a lot of files/code. Roland
Re: [Mesa3d-dev] Mesa (mesa_7_7_branch): mesa: Fix array out-of-bounds access by _mesa_TexGeni.
On 01.12.2009 15:35, Keith Whitwell wrote: On Tue, 2009-12-01 at 06:31 -0800, Roland Scheidegger wrote: On 01.12.2009 11:16, Ian Romanick wrote: Speaking of which... there are a bunch of conflicts merging 7.7 to master in Galliumland. Could one of you guys take a look at it? I have no clue what's going on over there. Quite a few of those were due to the gallium interface changes (introduced by the width0 branch merge). It will only get worse when I merge the gallium-noblocks branch (not quite there yet). Those changes are fairly intrusive, as the API changes affect a lot of files/code. They were pretty minimal really - but there was some knowledge required of what is new and what is old. It's not much fun resolving conflicts in code you don't know about, but the conflicts themselves weren't onerous. Yes, the changes themselves are pretty simple - it's just that because so many files are affected there's a lot of potential for future merge conflicts. Nothing really difficult to resolve, but annoying nonetheless (and there's no way to avoid it). Roland
Re: [Mesa3d-dev] [PATCH] Add entrypoints for setting vertex texture state
On 27.11.2009 19:32, michal wrote: Why is the MAX here smaller than for fragment samplers? Doesn't GL require them to be the same, because GL effectively binds the same set of sampler states in both cases? Can you take a closer look at what the GL state tracker would have to do to expose this functionality and make sure it's valid? It's all good. There is GL_MAX_VERTEX_TEXTURE_UNITS that tells how many samplers can be used in a vertex shader. Anything above that is used only with fragment shaders and ignored for vertex shaders. I fail to see though why the limit needs to be that low. All modern hardware nowadays can use the same number of texture samplers for both fragment and vertex shading (it's the same sampler hardware, after all). Some older hardware (typically non-unified, D3D9 shader model 3 compliant) indeed only had limited support for this (like the GeForce 6/7 series), probably only supporting 4 (can't remember exactly), while other hardware never implemented it at all despite d3d9 sm3 requiring it (thanks to an API loophole). Roland
[Mesa3d-dev] gallium width0 branch merge
Hi, just a warning: I'm planning on merging the width0 branch to master tomorrow. This is an interface change eliminating the width/height/depth arrays from pipe_texture, instead just storing the base width/height/depth. In-tree drivers/state trackers should be fixed (I think though there might be bugs with rbug), but obviously if you have any out-of-tree drivers they will break (though they should be trivial to fix). Roland
[Mesa3d-dev] st_shader-varients merge tomorrow
I'm planning to merge the st_shader-varients branch to master tomorrow. This should not adversely affect drivers, unless they rely on generic inputs/outputs semantic_index always starting at 0 without holes (something that they shouldn't do, but it would have worked previously). Feedback for hw drivers welcome; I'll try i915 myself, but I can't test the others, though a quick glance seemed to suggest they should be ok. Roland
Re: [Mesa3d-dev] Blit support for r300
On 23.10.2009 08:37, Maciej Cencora wrote: Hi, as you may already know the r300 classic driver is in pretty good shape these days, but there's one thing that causes major slowdowns in many games: lack of a hardware accelerated blit operation. The same is true for r100/r200... Currently all glCopyTex[Sub]Image operations are done through span functions, which is slow as hell. We could use the hw blitter unit, but using it causes stalls because of the 2D/3D mode switch. A long time ago I implemented this as a hack for r200 (just blit directly to the texture in vram, so never touching the backup texture in system memory). Worked quite well in practice (good enough for doom3 special effects). I didn't notice any obvious slowdowns due to 2d/3d sync issues (though maybe I didn't do any syncs...). I was wondering how this could be fixed and I got this crazy idea of porting the everything-is-a-texture concept from gallium to classic mesa. Actually not all of it, just the pieces that make the renderbuffers look like textures for the driver. You could probably just try to hack up a blit using the 3D engine? Though of course lots of setup would be needed. A nice thing about not using the blitter (apart from potential performance issues) is of course that you also get format conversion for free. Brian, what do you think about this idea? Is it feasible and worth doing? Maybe you have better ideas how to resolve this issue? Not sure what Brian's opinion on that is, but I'm not sure there's really much point in trying to port over half of gallium to classic mesa. Looks to me like time might be better spent working on gallium drivers instead... Roland
Re: [Mesa3d-dev] [PATCH 1/2] mesa: Compact state key for TexEnv program cache
Hmm, I'm not actually sure this will always reduce the state key size. I think the compiler is still allowed to pad the mode_opt struct out to whatever it likes (maybe #pragma pack(1) can prevent this), even though maybe gcc does not. I don't like pragmas too much, but it looks like the only way to do this in some clean c99 way would be to get rid of the mode_opt struct entirely? Roland

On 02.09.2009 16:23, Brian Paul wrote: Unfortunately gcc (version 4.3.2 anyway) warns on this:

main/texenvprogram.c:87: warning: type of bit-field ‘Source’ is a GCC extension
main/texenvprogram.c:88: warning: type of bit-field ‘Operand’ is a GCC extension

I'm trying to find a #pragma or something to silence the warning... -Brian

Keith Whitwell wrote: Looks great Chris. Keith

On Wed, 2009-09-02 at 05:11 -0700, Chris Wilson wrote: By rearranging the bitfields within the key we can reduce the size of the key from 644 to 196 bytes, reducing the cost of both the hashing and equality tests.
---
 src/mesa/main/texenvprogram.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/texenvprogram.c b/src/mesa/main/texenvprogram.c
index 5913957..3851937 100644
--- a/src/mesa/main/texenvprogram.c
+++ b/src/mesa/main/texenvprogram.c
@@ -82,8 +82,8 @@ texenv_doing_secondary_color(GLcontext *ctx)
 #define DISASSEM (MESA_VERBOSE & VERBOSE_DISASSEM)
 
 struct mode_opt {
-   GLuint Source:4;   /**< SRC_x */
-   GLuint Operand:3;  /**< OPR_x */
+   GLubyte Source:4;  /**< SRC_x */
+   GLubyte Operand:3; /**< OPR_x */
 };
 
 struct state_key {
@@ -103,10 +103,11 @@ struct state_key {
       GLuint NumArgsRGB:3;  /**< up to MAX_COMBINER_TERMS */
       GLuint ModeRGB:5;     /**< MODE_x */
-      struct mode_opt OptRGB[MAX_COMBINER_TERMS];
 
       GLuint NumArgsA:3;  /**< up to MAX_COMBINER_TERMS */
       GLuint ModeA:5;     /**< MODE_x */
+
+      struct mode_opt OptRGB[MAX_COMBINER_TERMS];
       struct mode_opt OptA[MAX_COMBINER_TERMS];
    } unit[MAX_TEXTURE_UNITS];
 };
Re: [Mesa3d-dev] Merging asm-shader-rework-1 branch today
On 23.08.2009 01:50, Ian Romanick wrote: Philipp Heise wrote: Ian Romanick wrote: Roland Scheidegger wrote: glprogs/R200_interaction.vp GL_PROGRAM_ERROR_STRING_ARB: line 1, char 43: error: syntax error, unexpected $undefined Okay. I posted a patch to bug #23457 that should fix this. Could you give it a test on R200, and let me know? I've only run this on my laptop, and I don't have Doom3 installed there. I haven't yet tested it in-game. Hi Ian, thanks for your great work! The problem is not the header of the vp, but the additional carriage return at the end of each line ... DOS newline format. Therefore the parser fails at the end of the header line. The attached patch should fix the problem. Oh good grief! It's always the little things. Hmm... Unix uses \n, and DOS uses \r\n. Don't Macs use \r? If that's the case, the proposed patch could cause the line numbers to be incorrect if the shaders are authored on Macs. I should be able to whip up a patch that will handle that case, but it will have to wait until later today. Thanks for tracking this down. Works perfectly now indeed. Sorry for leading you down the wrong track first; I don't know why the doom3 error output doesn't include the header of the shader even though it's actually there. Roland
Re: [Mesa3d-dev] Merging asm-shader-rework-1 branch today
On 21.08.2009 20:26, Ian Romanick wrote: All, In the next couple hours I'm planning to merge the asm-shader-rework-1 branch to master. In my testing I have found that it passes at least as many (and in a couple cases more) tests than the current code. One of our internal tests runs about 89,000 vertex programs. This test takes about 30 minutes (1,800 seconds) on current Mesa master. On the new code it takes about 25 seconds. Good work! It seems to break (all of) doom3's (vertex, at least) shaders however. At least with r200; here's the doom3 output for the main r200 vertex shader (the others break in exactly the same way).

glprogs/R200_interaction.vp
GL_PROGRAM_ERROR_STRING_ARB: line 1, char 43: error: syntax error, unexpected $undefined
error at 34: ariant ;

# this is slightly simpler than the ARB interaction,
# because the R200 can only emit six texture coordinates,
# so we assume that the diffuse and specular matrixes are
# the same, with higher level code splitting it into two
# passes if it isn't
#
# I am using texcoords instead of attribs, because a separate
# extension is required to use attribs with vertex array objects.
#
# input:
#
# TEX0 texture coordinates
# TEX1 tangent[0]
# TEX2 tangent[1]
# TEX3 normal
# COL vertex color
#
# c[4] localLightOrigin
# c[5] localViewOrigin
# c[6] lightProjection S
# c[7] lightProjection T
# c[8] lightProjection Q
# c[9] lightFalloff S
# c[10] bumpMatrix S
# c[11] bumpMatrix T
# c[12] diffuseMatrix S
# c[13] diffuseMatrix T
# c[14] specularMatrix S
# c[15] specularMatrix T
#
# output:
#
# texcoord 0 = light projection texGen
# texcoord 1 = light falloff texGen
# texcoord 2 = bumpmap texCoords
# texcoord 3 = specular / diffuse texCoords
# texcoord 4 = normalized halfangle vector in tangent space
# texcoord 5 = unnormalized vector to light in tangent space

TEMP R0, R1, R2, lightDir;
PARAM defaultTexCoord = { 0, 0.5, 0, 1 };

# texture 0 has three texgens
DP4 result.texcoord[0].x, vertex.position, program.env[6];
DP4 result.texcoord[0].y, vertex.position, program.env[7];
DP4 result.texcoord[0].w, vertex.position, program.env[8];

# texture 1 has one texgen
MOV result.texcoord[1], defaultTexCoord;
DP4 result.texcoord[1].x, vertex.position, program.env[9];

# texture 2 takes the base coordinates by the texture matrix
MOV result.texcoord[2], defaultTexCoord;
DP4 result.texcoord[2].x, vertex.texcoord[0], program.env[10];
DP4 result.texcoord[2].y, vertex.texcoord[0], program.env[11];

# texture 3 takes the base coordinates by the texture matrix
MOV result.texcoord[3], defaultTexCoord;
DP4 result.texcoord[3].x, vertex.texcoord[0], program.env[12];
DP4 result.texcoord[3].y, vertex.texcoord[0], program.env[13];

# texture 4's texcoords will be the halfangle in tangent space

# calculate normalized vector to light in R0
SUB lightDir, program.env[4], vertex.position;
DP3 R1, lightDir, lightDir;
RSQ R1, R1.x;
MUL R0, lightDir, R1.x;

# calculate normalized vector to viewer in R1
SUB R1, program.env[5], vertex.position;
DP3 R2, R1, R1;
RSQ R2, R2.x;
MUL R1, R1, R2.x;

# add together to become the half angle vector in object space (non-normalized)
ADD R0, R0, R1;

# put into texture space
DP3 result.texcoord[4].x, vertex.texcoord[1], R0;
DP3 result.texcoord[4].y, vertex.texcoord[2], R0;
DP3 result.texcoord[4].z, vertex.texcoord[3], R0;

# texture 5's texcoords will be the unnormalized lightDir in tangent space
DP3 result.texcoord[5].x, vertex.texcoord[1], lightDir;
DP3 result.texcoord[5].y, vertex.texcoord[2], lightDir;
DP3 result.texcoord[5].z, vertex.texcoord[3], lightDir;

# generate the vertex color, which can be 1.0, color, or 1.0 - color
# for 1.0 : env[16] = 0, env[17] = 1
# for color : env[16] = 1, env[17] = 0
# for 1.0-color : env[16] = -1, env[17] = 1
MAD result.color, vertex.color, program.env[16], program.env[17];
END
Re: [Mesa3d-dev] Mesa (master): i965: Use _MaxElement instead of index-calculated min/ max for VBO bounds.
On 13.08.2009 12:19, Michel Dänzer wrote: On Wed, 2009-08-12 at 11:31 -0700, Eric Anholt wrote: Module: Mesa Branch: master Commit: e643bc5fc7afb563028f5a089ca5e38172af41a8 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=e643bc5fc7afb563028f5a089ca5e38172af41a8 Author: Eric Anholt e...@anholt.net Date: Tue Aug 11 12:59:09 2009 -0700 i965: Use _MaxElement instead of index-calculated min/max for VBO bounds. This change breaks things all over the place here. E.g. progs/glsl/array and .../skinning are missing most of the geometry, and a lot of the other glsl progs have weird lighting. The problem here is that for vertex buffer elements which have zero stride, count is 1. This is now used as a bounds check, which will hence fail for most indices; according to the docs the hardware will simply return 0 in this case (and not the data at the start index or something like that, which would work). I'll fix this (it should be ok to just disable the bounds check for these cases, since in fact any index is valid if stride is 0). (Looks like this isn't an issue for IGDNG (Ironlake, I assume) since it appears this one checks against an address range instead of a maximum index, according to the code...) Roland
Re: [Mesa3d-dev] ATI Mobility Radeon X300: Blender menus all black (or white)
On 31.07.2009 10:26, Terry Barnaby wrote: Hi, I have a problem with the Mesa DRI Radeon 300 driver in that I cannot use the blender application, as the menus are not displayed correctly. See bug: https://bugs.freedesktop.org/show_bug.cgi?id=21774 I would like to get this fixed as I need to be able to run blender at a reasonable speed on this system. I am running the latest DRM/MESA/xf86-video-ati code from git. Are there any pointers on how to start debugging this? For example, can I turn off various hardware acceleration features one by one until I find the source of the problem? This looks like an issue with dri2 and front buffer rendering. Did you try whether this still works with dri1? You probably can't just disable dri2, but disabling kms (nomodeset boot param) should force the driver to use dri1 (for radeon cards) I think. Roland
Re: [Mesa3d-dev] ATI Mobility Radeon X300: Blender menus all black (or white)
On 31.07.2009 15:35, Terry Barnaby wrote: On 07/31/2009 02:15 PM, Roland Scheidegger wrote: On 31.07.2009 10:26, Terry Barnaby wrote: Hi, I have a problem with the Mesa DRI Radeon 300 driver in that I cannot use the blender application, as the menus are not displayed correctly. See bug: https://bugs.freedesktop.org/show_bug.cgi?id=21774 I would like to get this fixed as I need to be able to run blender at a reasonable speed on this system. I am running the latest DRM/MESA/xf86-video-ati code from git. Are there any pointers on how to start debugging this? For example, can I turn off various hardware acceleration features one by one until I find the source of the problem? This looks like an issue with dri2 and front buffer rendering. Did you try whether this still works with dri1? You probably can't just disable dri2, but disabling kms (nomodeset boot param) should force the driver to use dri1 (for radeon cards) I think. Roland Thanks for the reply. I have tried a lot of user level configuration options such as nomodeset, disabling EXA etc. None of these had any effect on the blender issue. Using nomodeset on the kernel boot line did appear to change the DRI interface from DRI2 to DRI1 (at least from the glxinfo report). Hence my move to trying the latest code from git ... Oh, so it also happens with DRI1. It would probably be easiest (as others have suggested) to do a git bisect then. Alternatively, since this appears to work on r200 but not r300, you could try finding differences wrt front buffer rendering between those drivers manually - it seems unlikely this would happen to work with git master on r200 but not r300, since pretty much all of the code which could affect this issue should be shared. Roland
Re: [Mesa3d-dev] ATI R200 code currently broken in git
On 31.07.2009 17:36, Terry Barnaby wrote: I have just compiled/installed the latest drm/mesa/xf86-video-ati code from git under Fedora 11 on a system with an ATI Technologies Inc RV280 [Radeon 9200 PRO] graphics board. 2D appears fine. 3D is quite broken. glxgears runs showing about 500 frames/sec; however it shows moving gears for 1 second followed by 5 seconds of still frame, repeated. blender just shows a really corrupted screen with chess board patterns, lots of horizontal tearing, half-drawn menus etc ... Hmm, I don't see that. I'm using a very old drm/ddx on that box though, so no kms/dri2, and no compositing either. Roland