Iago Toral <ito...@igalia.com> writes: > On Wed, 2019-03-06 at 09:21 +0100, Iago Toral wrote: >> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote: >> > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: >> > > Iago Toral <ito...@igalia.com> writes: >> > > >> > > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote: >> > > > > Iago Toral <ito...@igalia.com> writes: >> > > > > >> > > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote: >> > > > > > > Iago Toral <ito...@igalia.com> writes: >> > > > > > > >> > > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez >> > > > > > > > wrote: >> > > > > > > > > Iago Toral <ito...@igalia.com> writes: >> > > > > > > > > >> > > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez >> > > > > > > > > > wrote: >> > > > > > > > > > > Iago Toral Quiroga <ito...@igalia.com> writes: >> > > > > > > > > > > >> > > > > > > > > > > > --- >> > > > > > > > > > > > src/intel/compiler/brw_eu_validate.c | 64 >> > > > > > > > > > > > ++++++++++++- >> > > > > > > > > > > > src/intel/compiler/test_eu_validate.cpp | 122 >> > > > > > > > > > > > ++++++++++++++++++++++++ >> > > > > > > > > > > > 2 files changed, 185 insertions(+), 1 >> > > > > > > > > > > > deletion(- >> > > > > > > > > > > > ) >> > > > > > > > > > > > >> > > > > > > > > > > > diff --git >> > > > > > > > > > > > a/src/intel/compiler/brw_eu_validate.c >> > > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c >> > > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644 >> > > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c >> > > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c >> > > > > > > > > > > > @@ -531,7 +531,69 @@ >> > > > > > > > > > > > general_restrictions_based_on_operand_types(con >> > > > > > > > > > > > st >> > > > > > > > > > > > struct >> > > > > > > > > > > > gen_device_info *devinf >> > > > > > > > > > > > exec_type_size == 8 && dst_type_size == >> > > > > > > > > > > > 4) >> > > > > > > > > > > > dst_type_size = 8; >> > > > > > > > > > > > >> > > > > > > > > > > > - if (exec_type_size > dst_type_size) { >> > > > > > > > > > > > + /* From the BDW+ PRM: >> > > > > > > > > > > > + * >> > > > > > > > > > > > + * "There is no direct conversion from >> > > > > > > > > > > > HF >> > > > > > > > > > > > to >> > > > > > > > > > > > DF >> > > > > > > > > > > > or >> > > > > > > > > > > > DF to >> > > > > > > > > > > > HF. >> > > > > > > > > > > > + * There is no direct conversion from >> > > > > > > > > > > > HF >> > > > > > > > > > > > to >> > > > > > > > > > > > Q/UQ or >> > > > > > > > > > > > Q/UQ to >> > > > > > > > > > > > HF." >> > > > > > > > > > > > + */ >> > > > > > > > > > > > + enum brw_reg_type src0_type = >> > > > > > > > > > > > brw_inst_src0_type(devinfo, >> > > > > > > > > > > > inst); >> > > > > > > > > > > > + ERROR_IF(brw_inst_opcode(devinfo, inst) == >> > > > > > > > > > > > BRW_OPCODE_MOV >> > > > > > > > > > > > && >> > > > > > > > > > > >> > > > > > > > > > > Why is only the MOV instruction handled here and >> > > > > > > > > > > below? Aren't >> > > > > > > > > > > other >> > > > > > > > > > > instructions able to do implicit >> > > > > > > > > > > conversions? Probably >> > > > > > > > > > > means >> > > > > > > > > > > you >> > > > > > > > > > > need >> > > > > > > > > > > to deal with two sources rather than one. >> > > > > > > > > > >> > > > > > > > > > This comes from the programming notes of the MOV >> > > > > > > > > > instruction >> > > > > > > > > > (Volume >> > > > > > > > > > 2a, Command Reference - Instructions - MOV), so it >> > > > > > > > > > is >> > > > > > > > > > described >> > > > > > > > > > specifically for the MOV instruction. I should >> > > > > > > > > > probably >> > > > > > > > > > have >> > > > > > > > > > made >> > > > > > > > > > this >> > > > > > > > > > clear in the comment. >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > Maybe the one above is specified in the MOV page >> > > > > > > > > only, >> > > > > > > > > probably >> > > > > > > > > due >> > > > > > > > > to >> > > > > > > > > an oversight (If these restrictions were really >> > > > > > > > > specific >> > > > > > > > > to >> > > > > > > > > the >> > > > > > > > > MOV >> > > > > > > > > instruction, what would prevent you from implementing >> > > > > > > > > such >> > > > > > > > > conversions >> > > > > > > > > through a different instruction? E.g. "ADD dst:df, >> > > > > > > > > src:hf, >> > > > > > > > > 0" >> > > > > > > > > which >> > > > > > > > > would be substantially more efficient than what >> > > > > > > > > you're >> > > > > > > > > doing >> > > > > > > > > in >> > > > > > > > > PATCH >> > > > > > > > > 02) >> > > > > > > > >> > > > > > > > Instructions that take HF can only be strictly HF or >> > > > > > > > mix >> > > > > > > > F >> > > > > > > > and >> > > > > > > > HF >> > > > > > > > (mixed mode float), with MOV being the only exception. >> > > > > > > > That >> > > > > > > > means >> > > > > > > > that >> > > > > > > > any instruction like the one you use above are invalid. >> > > > > > > > Maybe >> > > > > > > > we >> > > > > > > > should >> > > > > > > > validate explicitly that instructions that use HF are >> > > > > > > > strictly >> > > > > > > > HF >> > > > > > > > or >> > > > > > > > mixed-float mode only? >> > > > > > > > >> > > > > > > >> > > > > > > So you're acknowledging that the conversions checked >> > > > > > > above >> > > > > > > are >> > > > > > > illegal >> > > > > > > whether the instruction is a MOV or something >> > > > > > > else. Where >> > > > > > > are we >> > > > > > > preventing instructions other than MOV with such >> > > > > > > conversions >> > > > > > > from >> > > > > > > being >> > > > > > > accepted by the validator? >> > > > > > >> > > > > > We aren't, because the validator is not checking, in >> > > > > > general, >> > > > > > for >> > > > > > accepted type combinations for specific instructions >> > > > > > anywhere >> > > > > > as >> > > > > > far as >> > > > > > I can see. >> > > > > >> > > > > Luckily these type conversion restrictions aren't really >> > > > > specific >> > > > > to >> > > > > any >> > > > > instruction AFAICT, even though they only seem to be >> > > > > mentioned >> > > > > explicitly for the MOV instruction... >> > > > > >> > > > > > If we want to start doing this with HF conversions >> > > > > > specifically, I >> > > > > > guess it is fine, but in case it is not clear, I think it >> > > > > > won't >> > > > > > bring >> > > > > > any immediate benefits with the VK_KHR_shader_float16_int8 >> > > > > > implementation since this series only ever emits >> > > > > > conversions >> > > > > > involving >> > > > > > HF operands via MOV instructions, >> > > > > >> > > > > Yes, I can see that's the intention of this series, but how >> > > > > can >> > > > > you >> > > > > make >> > > > > sure that nothing in the optimizer is breaking your >> > > > > assumption >> > > > > if >> > > > > you >> > > > > don't add some validator code to verify the claim of your >> > > > > last >> > > > > paragraph? >> > > > > >> > > > > > which is why I thought that validating that no direct MOV >> > > > > > conversions >> > > > > > from DF/Q types ever happen was useful, since we have code >> > > > > > in >> > > > > > the >> > > > > > driver to handle this scenario. >> > > > > > >> > > > > > [...] >> > > > > > > >> > > > > > > I still don't understand why you want to implement the >> > > > > > > same >> > > > > > > restriction >> > > > > > > twice, once for MOV and once for all other mixed-mode >> > > > > > > instructions. How >> > > > > > > is that more convenient? >> > > > > > >> > > > > > The restrictions based on operand types that are checked in >> > > > > > the >> > > > > > validator are specific to Byte or cases where the execution >> > > > > > type is >> > > > > > larger than the destination stride, for which mixed float >> > > > > > has >> > > > > > a >> > > > > > different set of restrictions. >> > > > > > >> > > > > > For example, for mixed float we have this rule: >> > > > > > >> > > > > > "In Align1, destination stride can be smaller than >> > > > > > execution >> > > > > > type" >> > > > > > >> > > > > > Which overrides this other from "General restrictions based >> > > > > > on >> > > > > > operand >> > > > > > types": >> > > > > > >> > > > > > "Destination stride must be equal to the ratio of the sizes >> > > > > > of >> > > > > > the >> > > > > > execution data type to the destination type" >> > > > > > >> > > > > > So I thought that it would make things easier to keep all >> > > > > > restrictions >> > > > > > for mixed float separate and make sure that we skipped >> > > > > > mixed >> > > > > > float >> > > > > > instructions in >> > > > > > general_restrictions_based_on_operand_types() >> > > > > > so we >> > > > > > avoid having to add code to skip individual general >> > > > > > restrictions >> > > > > > that that are overriden for mixed float mode anyway. >> > > > > > >> > > > > >> > > > > I'm fine with that, but it doesn't seem like what this patch >> > > > > is >> > > > > doing... >> > > > > Isn't it adding code to validate mixed-mode float MOV >> > > > > instructions in >> > > > > general_restrictions_based_on_operand_types()? >> > > > >> > > > I guess this could be arguable, but I was not considering >> > > > conversion >> > > > MOVs to be mixed-float instructions. There are two reasons for >> > > > this: >> > > > >> > > > A conversion MOV involving F/HF doesn't seem to be >> > > > fundamentally >> > > > different from any other conversion instruction involving other >> > > > types, >> > > > other than the requirement of aligning the destination to a >> > > > Dword, >> > > > which is not a resriction explictly made for mixed-float mode. >> > > > >> > > > Then, for mixed-float mode, there is this other rule: >> > > > >> > > > "In Align1, destination stride can be smaller than execution >> > > > type. >> > > > When >> > > > destination is stride of 1, 16 bit packed data >> > > > is updated on the destination. However, output packed f16 data >> > > > must >> > > > be >> > > > oword aligned, no oword crossing in >> > > > packed f16." >> > > > >> > > > Which contradicts the other rule that conversions from F to HF >> > > > need >> > > > to >> > > > be DWord aligned on the destination. >> > > > >> > > > So it seems to me that conversion MOVs are not following the >> > > > same >> > > > principles of mixed-float instructions and we should skip >> > > > validating >> > > > mixed-float restrictions for them. What do you think? >> > > > >> > > >> > > That all seems fairly ambiguous... And the restriction on DWORD >> > > alignment for conversions includes a mixed-mode ADD instruction >> > > among >> > > the examples, so I'm skeptical that MOV could be truly >> > > fundamentally >> > > different. >> > >> > Ok... in that case what do we do about the mixed-float restriction >> > I >> > quoted above? Since it is incompatible with the mixed-float MOV >> > conversion I guess we only have two options: ether we don't >> > validate >> > it >> > at all or we only validate it for mixed-float instructions that are >> > not >> > MOV. I guess we can do the latter? >> >> Also, related to this, if we consider implicit conversions in 2-src >> instructions to also be a target of the generic rules for conversions >> to HF described for CHV and SKL+, then doesn't that imply that all >> mixed-float instructions are conversions and subject to those rules? >> For example: >> >> ADD dst:hf src0:f src1:f >> ADD dst:hf src0:hf src1:f >> ADD dst:hf src0:f src1:hf >> >> In all 3 instructions above the execution type is F. Should we >> consider >> all of them implicit conversions? That would mean that any mixed- >> float >> instruction with HF destination is an implicit conversion from F to >> HF >> and therefore would be subject to the generic rules for those >> conversions, which at least in the case of CHV and SKL+ involves >> DWord >> alignment on the destination, or in other words: no mixed-float >> instruction can have packed fp16 destinations. But that seems to >> contradict what various mixed-float restrictions present in the docs >> that suggest that packed fp16 destinations are possible with mixed- >> float mode. Here are some examples: >> >> "No SIMD16 in mixed mode when destination is packed f16 for both >> Align1 >> and Align16" >> >> "In Align1, destination stride can be smaller than execution type. >> When >> destination is stride of 1, 16 bit packed data is updated on the >> destination. However, output packed f16 data must be oword aligned, >> no >> oword crossing in packed f16." >> >> "When source is float or half float from accumulator register and >> destination is half float with a stride of 1, the source must >> register >> aligned. i.e., source must have offset zero." >> >> Since I was only considering that conversion rules only applied to >> MOV >> instructions this was not an issue in my series, but if we consider >> that those rules also apply to implicit conversions from other >> instructions then it looks like all these restrictions refer to >> impossible situations, so I am not sure what to do about them... >> should >> we just ignore them in the validator? > > And I have just noticed another issue with the idea of considering > direct conversions involving F/HF mixed-float instructions. There is > this other restriction for mixed-float mode: > > "No SIMD16 in mixed mode when destination is f32. Instruction Execution > size must be no more than 8." > > So if we consider direct HF->F conversions mixed-float instructions > then we would habe to split all of them to be 8-wide. Obviously, I have > not been doing this and I have never seen any issues on any platform > because of this. >
Hm... That's kind of scary... Your code seems to be violating this and the other no-SIMD16 restriction of mixed-mode instructions to my best interpretation of the B-Spec. Unfortunately the simulator doesn't seem to enforce either of them so I'm not certain whether it's going to blow up in practice or not. Maybe implement the restriction in get_fpu_lowered_simd_width() to be on the safe side? > Iago > > > >> >> > > > > > Anyway, I'll try to rework the patches to do more generic >> > > > > > validation of >> > > > > > HF conversions like you ask and see what kind of code we >> > > > > > end >> > > > > > up >> > > > > > with. >> > > > > > >> > > > > >> > > > > Thanks. >> > > > > >> > > > > > [...]
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev