On Thu, Jan 4, 2018 at 8:56 PM, Jason Ekstrand <[email protected]> wrote:
> On January 4, 2018 12:51:15 Karol Herbst <[email protected]> wrote:
>
>> On Thu, Jan 4, 2018 at 7:06 PM, Ilia Mirkin <[email protected]> wrote:
>>>
>>> On Thu, Jan 4, 2018 at 10:01 AM, Karol Herbst <[email protected]> wrote:
>>>>
>>>> significant changes to last series:
>>>> * arb_gpu_shader5 interpolateat* (those nir ops don't map well to nvir)
>>>> no good plan on how to properly implement those
>>>
>>>
>>> What's the issue? They should map as well as the TGSI ones. (Since the
>>> TGSI ones are just the GLSL ones.)
>>>
>>
>> it is a bit ugly, because usually all inputs vars are lowered away, so
>> that they are inputs. So they need special handling;
>>
>> lowered (input is centroid):
>> vec1 32 ssa_25 = intrinsic load_input (ssa_24) () (0, 0) /* base=0 */
>> /* component=0 */ /* packed:centroid_qualified */
>> vec1 32 ssa_27 = intrinsic load_input (ssa_26) () (0, 1) /* base=0 */
>> /* component=1 */ /* packed:centroid_qualified */
>>
>> not lowered:
>> decl_var INTERP_MODE_NONE vec2 in@unqualified-temp
>> vec2 32 ssa_11 = intrinsic interp_var_at_centroid () (in@unqualified-temp)
>> ()
>>
>> I kind of wished I could have a load_input intrinsic with a flag or
>> load_input_at_centroid, so that I end up with the same code in the
>> end.
>
>
> In i965, we use the NIR explicit input interpolation intrinsics. I'm on my
> phone so I can't give more details easily.
>
>>>> * arb_gpu_shader5.texturegatheroffsets (nir internal assert)
>>>> glsl_to_nir.cpp:2082: virtual void
>>>> {anonymous}::nir_visitor::visit(ir_texture*): Assertion
>>>> `ir->offset->type->is_vector() || ir->offset->type->is_scalar()' failed.
>>>
>>>
>>> This is because nir doesn't support the 4-offset tg4 variant. This is
>>> expected (by nir) to be lowered in GLSL to 4 separate gathers, but
>>> isn't because nvc0 doesn't set the caps to make st/mesa do that.
>>> Either set that cap based on whether NIR is used, or teach nir about
>>> the 4-offset tg4 (which the nvidia hw supports directly btw).
>>>
>>
>> well I would prefer the last one obviously, but nir gives me a
>> nir_texop_tg4 in other tests, it is just those mentioned above where
>> it fails.
>
>
> I would prefer that as well. There's no reason NIR can't support it so we
> may as well add support. We should also move the lowering from spirv_to_nir
> to nir_lower_tex so that spirv_to_nir can give you the unlowered version you
> want.
>
>
>>>> * some int64 stuff related to compound types
>>>
>>>
>>> As I mentioned, you either have to fix RA (I don't recommend this), or
>>> you have to stop using 64-bit Value's for storage. Use 32-bit Value's,
>>> and merge/split them all the time around 64-bit ops like the TGSI FE
>>> does (which was implemented that way largely due to the way TGSI
>>> works, but is a happy coincidence that it also works around some of
>>> the RA shortcomings). And additionally you may need to improve the
>>> merge splits pass to avoid some of the pain.
>>>
>>> You could also just disable int64 for now - it's not important.
>>>
>>>> * various extensions
>>>> * variable-indexing (related to above mentioned packing issue)
>>>> * glsl-4.20.execution.vs_in
>>>> * some variable-indexing issues related to unaligned memory accesses
>>>
>>>
>>> The variable-indexing stuff is extremely important to work out, since
>>> it belies a fundamental problem in some approach to the conversion.
>>>
>>
>> well the normal variable indexing stuff works if I disable
>> nir_compact_varyings, which we might want to do anyway for nouveau for
>> now. Or I teach MemoryOpt to not merge things for unaligned addresses.
>>
>> I have to take a more focused look at the fails anyway
>>
>>>> * some geometry shader fails
>>>
>>>
>>> Have you done any testing with nv50? It should largely work out, but
>>> there are some things you have to be careful about.
>>> The TGSI frontend generates IR that is capable of being processed by
>>> both the nv50 and nvc0 lowering/RA/emission logic, would want to
>>> ensure that an nir frontend would be able to do this too. If you
>>> don't have access to a Tesla-era GPU, I can act as a tester in a
>>> limited capacity.
>>>
>>
>> I have a tesla GPU.
>>
>>> Sounds like this is still all pretty experimental and has a lot of
>>> deep issues given the fail/crash count... IMHO not ready for merging.
>>> Also you really need to come up with a workable solution to the
>>> immediates issue.
>>>
>>
>> well I could just store them like it is done with TGSI and just put
>> loadImms where accessed, but this doesn't really fit the NIR logic
>> here. Maybe there is a NIR pass to move them around, so that the issue
>> is less significant. Or maybe I always check if the source contains a
>> const value and use loadImm instead of getting the stored immediate
>> value. Yeah I think the last idea would be less painful, we just end
>> up with more dead instructions after converting.
>
>
> What is the nature of the immediate problem? We may have a similar issue.
>
>

we don't do rescheduling, so all the immediates are at the top of the shader.
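
To make the "check if the source contains a const value and use loadImm" idea from earlier in the thread a bit more concrete, here is a rough, self-contained sketch. It is only an illustration: ToySrc, ToyInstr and convert() are made-up names, not NIR or nouveau codegen API, and the output is just a text listing. The point is that each constant gets materialized right before its use rather than once at the top of the shader.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; none of these types exist in Mesa.
struct ToySrc {
    bool isConst;   // true if this source is a known constant
    uint32_t imm;   // constant value, only meaningful when isConst is true
    int ssa;        // SSA index, only meaningful when isConst is false
};

struct ToyInstr {
    std::string op;
    int dest;
    std::vector<ToySrc> srcs;
};

// Converts toy instructions into a flat text listing. Whenever a source is
// a constant, a loadImm into a fresh temporary is emitted immediately before
// the using instruction, instead of referencing a value stored at the top.
std::vector<std::string> convert(const std::vector<ToyInstr> &prog)
{
    std::vector<std::string> out;
    int nextTmp = 1000; // hypothetical numbering for materialized immediates

    for (const ToyInstr &i : prog) {
        std::string line = i.op + " %" + std::to_string(i.dest);
        for (const ToySrc &s : i.srcs) {
            int reg = s.ssa;
            if (s.isConst) {
                reg = nextTmp++;
                out.push_back("loadImm %" + std::to_string(reg) + " " +
                              std::to_string(s.imm));
            }
            line += " %" + std::to_string(reg);
        }
        out.push_back(line);
    }
    return out;
}
```

The downside mentioned above still applies: the original constant definitions are left behind as dead instructions (and the same constant may be loaded more than once), so this relies on later DCE/CSE to clean up.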
