The difference between an easier and harder life for (some) drivers is
whether the limit is tied to hardware interpolators or not.
Once we decide to not tie it, whether the limit is 128 or 256 is of
course quite inconsequential.
Allowing arbitrary 32-bit values would, however, require a binary
search or a hash table.
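
To illustrate the binary-search option, here is a minimal sketch. It
assumes a driver keeps a small per-shader table sorted by semantic
index; struct generic_slot and lookup_generic_slot are hypothetical
names for illustration, not existing Mesa code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one arbitrary 32-bit generic semantic index
 * and the hardware interpolator slot assigned to it. */
struct generic_slot {
    uint32_t semantic_index;
    unsigned hw_slot;
};

/* Look up semantic_index in a table sorted by ascending semantic_index.
 * Returns the hardware slot, or -1 if the index is not in the table. */
static int
lookup_generic_slot(const struct generic_slot *table, size_t count,
                    uint32_t semantic_index)
{
    size_t lo = 0, hi = count;   /* search the half-open range [lo, hi) */

    assert(table != NULL || count == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].semantic_index == semantic_index)
            return (int)table[mid].hw_slot;
        if (table[mid].semantic_index < semantic_index)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

With a limit of 128 or 256, the same lookup collapses to a direct array
index, which is the simplification the fixed limit buys.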

I think you or someone else from the Mesa team should decide how to
proceed, and most drivers would need to be fixed.

As I understand, the constraints are the following:

Hardware with no mapping capabilities:
- nv30 does not support any mapping. However, we already need to patch
fragment programs to insert constants, so we can patch input register
numbers as well. The current driver only supports 0-7 generic indices,
but I already implemented support for 0-255 indices with in-driver
linkage and patching. Note that nv30 lacks control flow in fragment
programs.
- nv40 is like nv30, but supports fp control flow, and may have some
configurable mapping support, with unknown behavior.
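
A rough sketch of what in-driver linkage for 0-255 indices could look
like follows. It is not the actual nv30 patch: link_generics and
HW_INTERPOLATORS are hypothetical, and the interpolator count of 8 is
assumed for illustration only. The idea is to hand out hardware
registers in the order the fragment program reads generic indices, then
patch both programs with the resulting table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_GENERIC_INDEX 256   /* indices 0-255, as in the text */
#define HW_INTERPOLATORS  8     /* assumed count, for illustration only */

/* Assign a hardware interpolator register to every generic index read by
 * the fragment program. On return, hw_reg[i] holds the register for
 * generic index i, or -1 if the fragment program does not read it.
 * Returns false if more registers are needed than the hardware has. */
static bool
link_generics(const unsigned char *fp_inputs, unsigned num_fp_inputs,
              int hw_reg[MAX_GENERIC_INDEX])
{
    unsigned next = 0;

    assert(fp_inputs != NULL || num_fp_inputs == 0);

    for (unsigned i = 0; i < MAX_GENERIC_INDEX; i++)
        hw_reg[i] = -1;

    for (unsigned i = 0; i < num_fp_inputs; i++) {
        unsigned idx = fp_inputs[i];

        if (hw_reg[idx] != -1)
            continue;               /* this index already has a register */
        if (next == HW_INTERPOLATORS)
            return false;           /* out of hardware interpolators */
        hw_reg[idx] = (int)next++;
    }
    return true;
}
```

The vertex program outputs and fragment program inputs would then both
be rewritten through hw_reg, in the same pass that already patches
constants into the fragment program.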

Hardware with capabilities that must be configured for each fp/vp pair:
- nv40 might have this, but the NVIDIA OpenGL driver does not use it.
- nv50 has configurable vp->gp and gp->fp mappings with 64 entries.
The current driver seems to support arbitrary 32-bit indices.
- r300 appears to have a configurable vp->fp mapping. The current
driver only supports 0-15 generic indices, but redefining
ATTR_GENERIC_COUNT could be enough to have it support larger numbers.

Hardware with automatic linkage when semantics match:
- VMware svga appears to support 14 * 16 semantics, but the current
driver only supports 0-15 generic indices. This could be fixed by
mapping GENERIC into all non-special SM3 semantics.
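
One way to picture that mapping is the sketch below. The usage numbers
follow the D3D9 D3DDECLUSAGE ordering; which slots count as "special"
is an assumption made here (POSITION0, PSIZE0, COLOR0, COLOR1 and
FOG0), chosen because it leaves 14 * 16 - 5 = 219 slots, the figure
used later in this mail. generic_to_sm3 and is_reserved are
hypothetical names, not svga driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_USAGES        14   /* SM3 usage types */
#define INDICES_PER_USAGE 16   /* usage indices 0-15 per type */

/* Usage numbers in D3D9 D3DDECLUSAGE order (only the ones needed here). */
enum {
    USAGE_POSITION = 0,
    USAGE_PSIZE    = 4,
    USAGE_COLOR    = 10,
    USAGE_FOG      = 11
};

struct sm3_semantic {
    unsigned usage;
    unsigned index;
};

/* Assumption: these five slots are the "special" ones kept out of the
 * GENERIC pool. */
static bool
is_reserved(unsigned usage, unsigned index)
{
    return (usage == USAGE_POSITION && index == 0) ||
           (usage == USAGE_PSIZE && index == 0) ||
           (usage == USAGE_COLOR && index <= 1) ||
           (usage == USAGE_FOG && index == 0);
}

/* Map the n-th GENERIC index to the n-th non-reserved (usage, index)
 * slot. Returns false if generic is beyond the available slots. */
static bool
generic_to_sm3(unsigned generic, struct sm3_semantic *out)
{
    unsigned seen = 0;

    assert(out != NULL);

    for (unsigned slot = 0; slot < NUM_USAGES * INDICES_PER_USAGE; slot++) {
        unsigned usage = slot / INDICES_PER_USAGE;
        unsigned index = slot % INDICES_PER_USAGE;

        if (is_reserved(usage, index))
            continue;
        if (seen++ == generic) {
            out->usage = usage;
            out->index = index;
            return true;
        }
    }
    return false;
}
```

A real driver would precompute this as a table rather than scanning on
every lookup; the loop is only meant to make the slot arithmetic
visible.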

Hardware that can do both configurable mappings and automatic linkage:
- r600 supports hardware linkage between matching semantic ids, which
appear to be byte-sized.

Other hardware:
- i915 has no hardware vertex shading.
- Not sure about i965.

Software:
1. SM3 wants to use 14 * 16 indices overall. This is apparently only
supported by the VMware closed source state tracker.
2. SM2 and non-GLSL OpenGL just want to use as many indices as the
hardware interpolator count.
3. GLSL currently wants to use at most about 10 indices more than the
hardware interpolator count. This can be fixed since we see both the
fragment and vertex shaders during linkage (the patch I sent did
that).
4. GLSL with EXT_separate_shader_objects does not add requirements
because only gl_TexCoord and other builtin varyings are supported.
User-defined varyings are not supported.
5. A hypothetical version of EXT_separate_shader_objects extended to
support user-defined varyings would either want arbitrary 32-bit
generic indices (by interning strings to generate the indices) or the
ability to specify a custom mapping between shader indices.
6. A hypothetical "no-op" implementation of the GLSL linker would have
the same requirement.
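
To make "interning strings" concrete, here is a minimal sketch under
the assumption that all shaders share one table, so that the same
varying name always yields the same generic index. intern_table and
intern_varying are hypothetical names, and the linear scan stands in
for the hash table a real implementation would more likely use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INTERNED 1024   /* arbitrary capacity for this sketch */

/* Hypothetical table shared by all shaders. */
struct intern_table {
    char *names[MAX_INTERNED];
    uint32_t count;
};

/* Return a stable index for name, adding it on first use.
 * Returns UINT32_MAX if the table is full or allocation fails. */
static uint32_t
intern_varying(struct intern_table *t, const char *name)
{
    size_t len;
    char *copy;

    assert(t != NULL && name != NULL);

    for (uint32_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return i;               /* already interned */

    if (t->count == MAX_INTERNED)
        return UINT32_MAX;

    len = strlen(name) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return UINT32_MAX;
    memcpy(copy, name, len);

    t->names[t->count] = copy;
    return t->count++;
}
```

Because the table grows with every distinct name seen by any shader,
the indices are not bounded by what a single shader pair uses, which is
why this approach wants a large index range.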

Also note that non-GENERIC indices have peculiar properties.

For COLOR and BCOLOR:
1. SM3, and OpenGL with glClampColor set appropriately, want it to
_not_ be clamped to [0, 1].
2. SM2 and normal OpenGL apparently want it to be clamped to [0, 1]
(sometimes for fixed point targets only) and may also allow using
U8_UNORM precision for it instead of FP32.
3. OpenGL allows enabling two-sided lighting, in which case COLOR in
the fragment shader is automagically set to BCOLOR for back faces.
4. Older hardware (e.g. nv30) tends to support BCOLOR but not FACING.
Some hardware (e.g. nv40) supports both FACING and BCOLOR in hardware.
The latest hardware probably supports FACING only.
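
Written out in plain C, the COLOR/BCOLOR behaviour from points 1-3
amounts to the following. This only models the semantics; it is not
driver or shader code, and resolve_color and its parameters are
hypothetical. On FACING-only hardware the same select would presumably
have to be emitted into the fragment program.

```c
#include <stdbool.h>

struct vec4 {
    float v[4];
};

static float
clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Pick the color the fragment shader sees as COLOR.
 * two_sided: two-sided lighting is enabled.
 * clamp:     the API wants COLOR clamped to [0, 1] (e.g. SM2). */
static struct vec4
resolve_color(bool front_facing, bool two_sided, bool clamp,
              struct vec4 color, struct vec4 bcolor)
{
    /* Back faces read BCOLOR only when two-sided lighting is on. */
    struct vec4 out = (two_sided && !front_facing) ? bcolor : color;

    if (clamp) {
        for (int i = 0; i < 4; i++)
            out.v[i] = clamp01(out.v[i]);
    }
    return out;
}
```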

Any API that requires special semantics for COLOR and BCOLOR (i.e.
non-SM3) seems to only want 0-1 indices.

Note that SM3 does *not* include BCOLOR, so basically the limits for
generic indices would need to be conditional on BCOLOR being present
or not (e.g. if it is present, we must reserve two semantic slots in
svga for it).

POSITION0 is obviously special.
PSIZE0 is also special for points.

FOG0 seems right now to just be a GENERIC with a single component.
Gallium could be extended to support fixed function fog, which most
DX9 hardware supports (nv30/nv40 and r300). This is mostly orthogonal
to the semantic issue.

TGSI_SEMANTIC_NORMAL is essentially unused and should probably be removed.

The options are the ones you outlined, plus:
(e) Allow arbitrary 32-bit indices. This requires slightly more
complicated data structures in some cases, and will require svga and
r600 to fall back to software linkage if numbers are too high.
(f) Limit semantic indices to hardware interpolators _and_ introduce
an interface to let the user specify an

Personally I think the simplest idea for now could be to have all
drivers support 256 indices or, in the case of r600 and svga, the
maximum value supported by the hardware, and expose that as a cap (as
well as another cap for the number of different semantic values
supported at once).
The minimum guaranteed value would be set to the lowest hardware
constraint, which would be svga with 219 indices (assuming BCOLOR is
not used).
If new constraints pop up, we just lower it and change SM3 state
trackers to check for it and fall back otherwise.
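
As a sketch of how a state tracker might consume the two caps: the
struct and function below are hypothetical, not existing Gallium caps
or interfaces, and the values a driver would report are left to the
driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical caps reported by a driver. */
struct semantic_caps {
    unsigned max_generic_index;     /* valid indices are 0..max-1 */
    unsigned max_generics_at_once;  /* distinct semantics usable at once */
};

/* Decide whether a shader's generic indices fit the driver's caps.
 * If this returns false, the state tracker has to fall back, e.g. by
 * remapping the indices itself before handing shaders to the driver. */
static bool
generics_fit_caps(const struct semantic_caps *caps,
                  const unsigned *indices, unsigned count)
{
    assert(caps != NULL);
    assert(indices != NULL || count == 0);

    if (count > caps->max_generics_at_once)
        return false;

    for (unsigned i = 0; i < count; i++)
        if (indices[i] >= caps->max_generic_index)
            return false;

    return true;
}
```

With the svga figure above, a driver reporting max_generic_index = 219
would accept indices 0-218 and reject 219.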

This should just require simple fixes to svga and r300, and
significant code for nv30/nv40, which is however already implemented.

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev