For (some) drivers, the difference between an easier and a harder life is whether the limit is tied to the hardware interpolators or not. Once we decide not to tie it, whether the limit is 128 or 256 is of course quite inconsequential. Allowing arbitrary 32-bit values would, however, require the use of a binary search or a hash table.
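To make the cost of arbitrary 32-bit indices concrete, here is a minimal sketch of the binary-search variant: the vertex program outputs are sorted by generic index, each gets a consecutive slot, and each fragment program input is then looked up in that table. The names generic_slot, assign_slots and lookup_slot are made up for illustration and are not existing Gallium or driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pair: a GENERIC semantic index written by the vertex
 * program and the hardware slot the driver assigns to it. */
struct generic_slot {
    uint32_t index; /* arbitrary 32-bit GENERIC index */
    unsigned slot;  /* assigned interpolator slot */
};

static int cmp_generic_slot(const void *a, const void *b)
{
    uint32_t x = ((const struct generic_slot *)a)->index;
    uint32_t y = ((const struct generic_slot *)b)->index;
    return (x > y) - (x < y);
}

/* Sort the vertex outputs by index and hand out slots 0..count-1. */
void assign_slots(struct generic_slot *table, size_t count)
{
    size_t i;

    assert(table || count == 0);
    qsort(table, count, sizeof(table[0]), cmp_generic_slot);
    for (i = 0; i < count; i++)
        table[i].slot = (unsigned)i;
}

/* Binary search for a fragment input; returns the slot, or -1 if no
 * vertex output carries that index. */
int lookup_slot(const struct generic_slot *table, size_t count, uint32_t index)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].index == index)
            return (int)table[mid].slot;
        if (table[mid].index < index)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

With a limit of 128 or 256, the same lookup could presumably be a plain array indexed by the generic index, which is why the exact limit matters little once it is decoupled from the interpolator count.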
I think you or someone else from the Mesa team should decide how to proceed. Note that most drivers would need to be fixed.

As I understand it, the constraints are the following.

Hardware with no capabilities:
- nv30 does not support any mapping. However, we already need to patch fragment programs to insert constants, so we can patch input register numbers as well. The current driver only supports generic indices 0-7, but I have already implemented support for indices 0-255 with in-driver linkage and patching. Note that nv30 lacks control flow in fragment programs.
- nv40 is like nv30, but supports fp control flow, and may have some configurable mapping support, with unknown behavior.

Hardware with capabilities that must be configured for each fp/vp pair:
- nv40 might have this, but the nVidia OpenGL driver does not use it.
- nv50 has configurable vp->gp and gp->fp mappings with 64 entries. The current driver seems to support arbitrary 0-2^32 indices.
- r300 appears to have a configurable vp->fp mapping. The current driver only supports generic indices 0-15, but redefining ATTR_GENERIC_COUNT could be enough to have it support larger numbers.

Hardware with automatic linkage when semantics match:
- VMware svga appears to support 14 * 16 semantics, but the current driver only supports generic indices 0-15. This could be fixed by mapping GENERIC into all non-special SM3 semantics.

Hardware that can do both configurable mappings and automatic linkage:
- r600 supports linkage in hardware between matching semantic ids, which are apparently byte-sized.

Other hardware:
- i915 has no hardware vertex shading.
- Not sure about i965.

Software:
1. SM3 wants to use 14 * 16 indices overall. This is apparently only supported by the VMware closed source state tracker.
2. SM2 and non-GLSL OpenGL just want to use as many indices as the hardware interpolator count.
3. GLSL currently wants to use at most about 10 indices more than the hardware interpolator count. This can be fixed since we see both the fragment and vertex shaders during linkage (the patch I sent did that).
4. GLSL with EXT_separate_shader_objects does not add requirements, because only gl_TexCoord and other builtin varyings are supported. User-defined varyings are not supported.
5. A hypothetical version of EXT_separate_shader_objects extended to support user-defined varyings would want either arbitrary 32-bit generic indices (by interning strings to generate the indices) or the ability to specify a custom mapping between shader indices.
6. A hypothetical "no-op" implementation of the GLSL linker would have the same requirement.

Also note that non-GENERIC indices have peculiar properties.

For COLOR and BCOLOR:
1. SM3, and OpenGL with glClampColor appropriately set, want it to _not_ be clamped to [0, 1].
2. SM2 and normal OpenGL apparently want it to be clamped to [0, 1] (sometimes for fixed point targets only) and may also allow using U8_UNORM precision for it instead of FP32.
3. OpenGL allows enabling two-sided lighting, in which case COLOR in the fragment shader is automagically set to BCOLOR for back faces.
4. Older hardware (e.g. nv30) tends to support BCOLOR but not FACING. Some hardware (e.g. nv40) supports both FACING and BCOLOR in hardware. The latest hardware probably supports FACING only.

Any API that requires special semantics for COLOR and BCOLOR (i.e. non-SM3) seems to only want indices 0-1.

Note that SM3 does *not* include BCOLOR, so basically the limits for generic indices would need to be conditional on BCOLOR being present or not (e.g. if it is present, we must reserve two semantic slots in svga for it).

POSITION0 is obviously special. PSIZE0 is also special for points.

FOG0 seems right now to just be a GENERIC with a single component. Gallium could be extended to support fixed function fog, which most DX9 hardware supports (nv30/nv40 and r300). This is mostly orthogonal to the semantic issue.
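As a rough illustration of the svga idea (mapping GENERIC into all non-special SM3 semantics), the sketch below walks the 14 * 16 (usage, usage index) pairs and skips the reserved ones. The usage numbers follow the D3DDECLUSAGE enumeration. Everything else is an assumption on my part: the function names are made up, the reserved set (POSITION0, PSIZE0, FOG0, COLOR0, COLOR1) is a guess chosen so that the count comes out at the 219 figure used in this message, and parking BCOLOR in COLOR2/COLOR3 is purely hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_USAGES        14 /* SM3 declaration usages */
#define NUM_USAGE_INDICES 16 /* usage indices per usage */

/* Usage numbers as in D3DDECLUSAGE. */
enum { USAGE_POSITION = 0, USAGE_PSIZE = 4, USAGE_COLOR = 10, USAGE_FOG = 11 };

/* ASSUMPTION: which pairs count as "special" is a guess, not taken
 * from the svga driver. */
static bool is_reserved(unsigned usage, unsigned idx, bool has_bcolor)
{
    if (usage == USAGE_POSITION && idx == 0)
        return true;
    if (usage == USAGE_PSIZE && idx == 0)
        return true;
    if (usage == USAGE_FOG && idx == 0)
        return true;
    if (usage == USAGE_COLOR && idx < 2) /* COLOR0, COLOR1 */
        return true;
    /* ASSUMPTION: BCOLOR0/1 take two extra slots, here COLOR2/COLOR3. */
    if (has_bcolor && usage == USAGE_COLOR && idx < 4)
        return true;
    return false;
}

/* Number of (usage, index) pairs left over for GENERIC. */
unsigned count_generic_slots(bool has_bcolor)
{
    unsigned u, i, n = 0;

    for (u = 0; u < NUM_USAGES; u++)
        for (i = 0; i < NUM_USAGE_INDICES; i++)
            if (!is_reserved(u, i, has_bcolor))
                n++;
    return n;
}

/* Map a GENERIC index to the n-th non-reserved pair.
 * Returns false if the index does not fit. */
bool map_generic(unsigned generic, bool has_bcolor,
                 unsigned *usage, unsigned *idx)
{
    unsigned u, i, n = 0;

    assert(usage && idx);
    for (u = 0; u < NUM_USAGES; u++) {
        for (i = 0; i < NUM_USAGE_INDICES; i++) {
            if (is_reserved(u, i, has_bcolor))
                continue;
            if (n++ == generic) {
                *usage = u;
                *idx = i;
                return true;
            }
        }
    }
    return false;
}
```

A false return from map_generic is the point where a driver would have to reject the shader pair or fall back to some other form of linkage.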
TGSI_SEMANTIC_NORMAL is essentially unused and should probably be removed.

The options are the ones you outlined, plus:
(e) Allow arbitrary 32-bit indices. This requires slightly more complicated data structures in some cases, and will require svga and r600 to fall back to software linkage if the numbers are too high.
(f) Limit semantic indices to hardware interpolators _and_ introduce an interface to let the user specify an

Personally, I think the simplest idea for now could be to have all drivers support 256 indices or, in the case of r600 and svga, the maximum value supported by the hardware, and expose that as a cap (as well as another cap for the number of different semantic values supported at once).

The minimum guaranteed value would be set to the lowest hardware constraint, which would be svga with 219 indices (assuming no BCOLOR is used). If some new constraints pop up, we just lower it and change SM3 state trackers to check for it and fall back otherwise.

This should just require simple fixes to svga and r300, and significant code for nv30/nv40, which is however already implemented.

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev