On Fri, Apr 20, 2018 at 02:16:17PM +0200, Nicolai Hähnle wrote:
> On 20.04.2018 10:21, Iago Toral wrote:
> >Hi,
> >
> >while developing support for Vulkan shaderInt16 on Anvil I came across
> >a feature of NIR that was a bit inconvenient: bools are always 32-bit
> >by design, but the Intel hardware produces 16-bit bool results for
> >16-bit comparisons, so that creates a problem that manifests like this:
> >
> >vec1 32 ssa_21 = fge ssa_20, ssa_16
> >vec1 16 ssa_22 = b2f ssa_21
> >
> >Our CMP instruction will produce a 16-bit boolean result for the first
> >NIR instruction (where NIR expects it to be 32-bit), so by the time we
> >emit the second instruction in the driver the bit-size for the operand
> >of b2f provided by NIR no longer matches the reality and we emit
> >incorrect code.
> >
> >This seems to have been a conscious design choice in NIR, and while
> >discussing this with Jason he was unsure how much we wanted to change
> >this or how to do it, given how thoroughly 32-bit bools are baked into
> >NIR and the complexities that modifying this would also bring to our
> >bit-size validation code.
> >
> >I have been considering alternatives that didn't involve changing NIR
> >to support multiple bit-sizes for booleans:
> >
> >1) Drivers that need to emit smaller booleans could try to fix the
> >generated NIR by correcting the expected bit-sizes for CMP
> >instructions. This would be rather trivial to implement in drivers (and
> >maybe we could even make a generic pass for other drivers to use if
> >they need it) but this will make the validator complain because it
> >won't recognize comparisons with 16-bit bool outputs as valid NIR
> >opcodes. I also found instances where nir_search would complain about
> >mismatching bit-sizes. I haven't looked any further into it yet though,
> >so maybe we can reasonably work around these issues.
> >
> >2) Drivers could handle this specially when they emit code from NIR.
> >Specifically, when they see a 32-bit boolean source in an instruction,
> >they would have to search for the instruction that produced that source
> >value and check whether it is a 16-bit or a 32-bit boolean to emit
> >proper code for the instruction.
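
The lookup described in option 2 could look roughly like the Python sketch
below. It is a toy model, not NIR or driver code: `Instr`, `COMPARISONS` and
`hw_bool_bit_size` are hypothetical names, and it assumes that ssa_16 and
ssa_20 in the example above are 16-bit values.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Instr:
    """Toy stand-in for an SSA instruction (hypothetical, not a NIR type)."""
    op: str
    bit_size: int            # destination bit size as declared by the IR
    srcs: Tuple[str, ...]    # names of the SSA values used as sources


# Illustrative set of comparison opcodes for this toy model.
COMPARISONS = {"fge", "flt", "feq", "fne", "ige", "ilt", "ieq", "ine"}


def hw_bool_bit_size(defs: Dict[str, Instr], name: str) -> int:
    """Return the bit size the hardware would really produce for `name`.

    For a comparison, assume the hardware bool is as wide as the compared
    operands; for anything else, trust the declared bit size.
    """
    instr = defs[name]
    if instr.op in COMPARISONS:
        return defs[instr.srcs[0]].bit_size
    return instr.bit_size


# The example from the quoted mail, with 16-bit operands assumed.
defs = {
    "ssa_16": Instr("load_const", 16, ()),
    "ssa_20": Instr("load_const", 16, ()),
    "ssa_21": Instr("fge", 32, ("ssa_20", "ssa_16")),
    "ssa_22": Instr("b2f", 16, ("ssa_21",)),
}

# When emitting b2f, look up the producer of its source instead of
# trusting the declared 32-bit size.
b2f_src = defs["ssa_22"].srcs[0]
real_size = hw_bool_bit_size(defs, b2f_src)
```

This only walks one step back to a comparison. A boolean that reaches its
user through logical ops or phis would need more than this single lookup.
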

I toyed with something similar in:

https://lists.freedesktop.org/archives/mesa-dev/2017-November/178038.html

That patch isn't thought through at all, but I thought I'd mention it. It
also brings up a related problem I encountered with logical ops.

> >
> >3) Drivers can just convert the 16-bit bool result they generate for
> >16-bit cmp to the 32-bit bool that NIR expects, and then possibly run
> >an optimization pass to eliminate these extra conversions and fix up
> >the code accordingly.
>
> radeonsi(NIR) and radv already use option 3, since GCN hardware really wants
> to treat bools as 1-bit values, so that's what I'd suggest. The optimizations
> that clean up the conversions happen in LLVM for us.
>
> Cheers,
> Nicolai
>
> >
> >Does anyone else have any better ideas?
> >
> >Iago
>
> --
> Learn how the world really is,
> but never forget how it ought to be.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev