On Sunday, July 31, 2016 12:22:40 PM PDT Francisco Jerez wrote: > Kenneth Graunke <[email protected]> writes: > > > On Friday, July 22, 2016 8:59:03 PM PDT Francisco Jerez wrote: > >> The problem with the current approach is that driver output locations > >> are represented as a linear offset within the nir_outputs array, which > >> makes it rather difficult for the back-end to figure out what color > >> output and index some nir_intrinsic_load/store_output was meant for, > >> because the offset of a given output within the nir_output array is > >> dependent on the type and size of all previously allocated outputs. > >> Instead this defines the driver location of an output to be the pair > >> formed by its GLSL-assigned location and index (I've borrowed the > >> bitfield macros from brw_defines.h in order to represent the pair of > >> integers as a single scalar value that can be assigned to > >> nir_variable_data::driver_location). nir_assign_var_locations is no > >> longer useful for fragment outputs. > >> > >> Because fragment outputs are now allocated independently rather than > >> within the nir_outputs array, the get_frag_output() helper becomes > >> necessary in order to obtain the right temporary register for a given > >> location-index pair. > >> > >> The type_size helper passed to nir_lower_io is now type_size_dvec4 > >> rather than type_size_vec4_times_4 so that output array offsets are > >> provided in terms of whole array elements rather than in terms of > >> scalar components (dvec4 is the largest vector type supported by the > >> GLSL so this will cause all individual fragment outputs to have a size > >> of one regardless of the type). > > > > I don't think you should need type_size_dvec4 - double-precision > > fragment shader outputs don't exist. GL_ARB_gpu_shader_fp64 says: > > > > (3) Should double-precision fragment shader outputs be supported? > > > > RESOLVED: Not in this extension. Note that we don't have > > double-precision framebuffer formats to accept such values. > > > > Why not just use type_size_vec4? With that changed, this patch would > > get my Reviewed-by. > > > > Yeah, I'm aware of that restriction (c.f. PATCH 7), but the point here > was to get output offsets counted in whole array elements regardless of > the type, type_size_dvec4 is strictly better for that purpose than > type_size_vec4, because dvec4 is the largest GLSL vector type, and they > are otherwise equivalent when used on types smaller than a dvec4. A > type_size_vectors helper may make sense instead though.
Ah. I was concerned about the inconsistency in this patch - you use type_size_dvec4 for lowering, but allocate the registers with type_size_vec4_times_4, which both count doubles differently. It works because there are no doubles, of course, but it seemed like using type_size_vec4 for both would be more consistent. Then again, in the next patch, you drop the type_size_vec4_times_4 call in favor of allocating VGRFs of size 4 directly. I think I see what you mean, though. I'd always thought about FRAG_RESULT_DATA* as vec4-based. But it's not, exactly...array indexes correspond to render target indexes...so we want each array element to be 1. We could use a type size function that counts 1 per array element, and ignores structs completely. But that's basically what type_size_dvec4 does, and it already exists. I suppose in the presence of theoretical double outputs, we'd still want to count by 1. We'd just want to allocate registers of size 8 if the intrinsic's source's bit size is 64 instead of 32. You win :) Reviewed-by: Kenneth Graunke <[email protected]>
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
