On Tue, 2019-03-12 at 15:46 -0700, Francisco Jerez wrote: > Iago Toral <ito...@igalia.com> writes: > > > On Wed, 2019-03-06 at 09:21 +0100, Iago Toral wrote: > > > On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote: > > > > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: > > > > > Iago Toral <ito...@igalia.com> writes: > > > > > > > > > > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote: > > > > > > > Iago Toral <ito...@igalia.com> writes: > > > > > > > > > > > > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez > > > > > > > > wrote: > > > > > > > > > Iago Toral <ito...@igalia.com> writes: > > > > > > > > > > > > > > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez > > > > > > > > > > wrote: > > > > > > > > > > > Iago Toral <ito...@igalia.com> writes: > > > > > > > > > > > > > > > > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco > > > > > > > > > > > > Jerez > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Iago Toral Quiroga <ito...@igalia.com> > > > > > > > > > > > > > writes: > > > > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > src/intel/compiler/brw_eu_validate.c | > > > > > > > > > > > > > > 64 > > > > > > > > > > > > > > ++++++++++++- > > > > > > > > > > > > > > src/intel/compiler/test_eu_validate.cpp | > > > > > > > > > > > > > > 122 > > > > > > > > > > > > > > ++++++++++++++++++++++++ > > > > > > > > > > > > > > 2 files changed, 185 insertions(+), 1 > > > > > > > > > > > > > > deletion(- > > > > > > > > > > > > > > ) > > > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git > > > > > > > > > > > > > > a/src/intel/compiler/brw_eu_validate.c > > > > > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c > > > > > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644 > > > > > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c > > > > > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c > > > > > > > > > > > > > > @@ -531,7 +531,69 @@ > > > > > > > > > > > > > > general_restrictions_based_on_operand_types > > > > > > > > > > > > > > (con > > > > > > > > > > > > > > st > > > > > > > > > > > > > > struct > > > > > > > > > > > > > > gen_device_info *devinf > > > > > > > > > > > > > > exec_type_size == 8 && > > > > > > > > > > > > > > dst_type_size == > > > > > > > > > > > > > > 4) > > > > > > > > > > > > > > dst_type_size = 8; > > > > > > > > > > > > > > > > > > > > > > > > > > > > - if (exec_type_size > dst_type_size) { > > > > > > > > > > > > > > + /* From the BDW+ PRM: > > > > > > > > > > > > > > + * > > > > > > > > > > > > > > + * "There is no direct conversion > > > > > > > > > > > > > > from > > > > > > > > > > > > > > HF > > > > > > > > > > > > > > to > > > > > > > > > > > > > > DF > > > > > > > > > > > > > > or > > > > > > > > > > > > > > DF to > > > > > > > > > > > > > > HF. > > > > > > > > > > > > > > + * There is no direct conversion > > > > > > > > > > > > > > from > > > > > > > > > > > > > > HF > > > > > > > > > > > > > > to > > > > > > > > > > > > > > Q/UQ or > > > > > > > > > > > > > > Q/UQ to > > > > > > > > > > > > > > HF." > > > > > > > > > > > > > > + */ > > > > > > > > > > > > > > + enum brw_reg_type src0_type = > > > > > > > > > > > > > > brw_inst_src0_type(devinfo, > > > > > > > > > > > > > > inst); > > > > > > > > > > > > > > + ERROR_IF(brw_inst_opcode(devinfo, inst) > > > > > > > > > > > > > > == > > > > > > > > > > > > > > BRW_OPCODE_MOV > > > > > > > > > > > > > > && > > > > > > > > > > > > > > > > > > > > > > > > > > Why is only the MOV instruction handled here > > > > > > > > > > > > > and > > > > > > > > > > > > > below? Aren't > > > > > > > > > > > > > other > > > > > > > > > > > > > instructions able to do implicit > > > > > > > > > > > > > conversions? Probably > > > > > > > > > > > > > means > > > > > > > > > > > > > you > > > > > > > > > > > > > need > > > > > > > > > > > > > to deal with two sources rather than one. > > > > > > > > > > > > > > > > > > > > > > > > This comes from the programming notes of the > > > > > > > > > > > > MOV > > > > > > > > > > > > instruction > > > > > > > > > > > > (Volume > > > > > > > > > > > > 2a, Command Reference - Instructions - MOV), so > > > > > > > > > > > > it > > > > > > > > > > > > is > > > > > > > > > > > > described > > > > > > > > > > > > specifically for the MOV instruction. I should > > > > > > > > > > > > probably > > > > > > > > > > > > have > > > > > > > > > > > > made > > > > > > > > > > > > this > > > > > > > > > > > > clear in the comment. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe the one above is specified in the MOV page > > > > > > > > > > > only, > > > > > > > > > > > probably > > > > > > > > > > > due > > > > > > > > > > > to > > > > > > > > > > > an oversight (If these restrictions were really > > > > > > > > > > > specific > > > > > > > > > > > to > > > > > > > > > > > the > > > > > > > > > > > MOV > > > > > > > > > > > instruction, what would prevent you from > > > > > > > > > > > implementing > > > > > > > > > > > such > > > > > > > > > > > conversions > > > > > > > > > > > through a different instruction? E.g. "ADD > > > > > > > > > > > dst:df, > > > > > > > > > > > src:hf, > > > > > > > > > > > 0" > > > > > > > > > > > which > > > > > > > > > > > would be substantially more efficient than what > > > > > > > > > > > you're > > > > > > > > > > > doing > > > > > > > > > > > in > > > > > > > > > > > PATCH > > > > > > > > > > > 02) > > > > > > > > > > > > > > > > > > > > Instructions that take HF can only be strictly HF > > > > > > > > > > or > > > > > > > > > > mix > > > > > > > > > > F > > > > > > > > > > and > > > > > > > > > > HF > > > > > > > > > > (mixed mode float), with MOV being the only > > > > > > > > > > exception. > > > > > > > > > > That > > > > > > > > > > means > > > > > > > > > > that > > > > > > > > > > any instruction like the one you use above are > > > > > > > > > > invalid. > > > > > > > > > > Maybe > > > > > > > > > > we > > > > > > > > > > should > > > > > > > > > > validate explicitly that instructions that use HF > > > > > > > > > > are > > > > > > > > > > strictly > > > > > > > > > > HF > > > > > > > > > > or > > > > > > > > > > mixed-float mode only? > > > > > > > > > > > > > > > > > > > > > > > > > > > > So you're acknowledging that the conversions checked > > > > > > > > > above > > > > > > > > > are > > > > > > > > > illegal > > > > > > > > > whether the instruction is a MOV or something > > > > > > > > > else. Where > > > > > > > > > are we > > > > > > > > > preventing instructions other than MOV with such > > > > > > > > > conversions > > > > > > > > > from > > > > > > > > > being > > > > > > > > > accepted by the validator? > > > > > > > > > > > > > > > > We aren't, because the validator is not checking, in > > > > > > > > general, > > > > > > > > for > > > > > > > > accepted type combinations for specific instructions > > > > > > > > anywhere > > > > > > > > as > > > > > > > > far as > > > > > > > > I can see. > > > > > > > > > > > > > > Luckily these type conversion restrictions aren't really > > > > > > > specific > > > > > > > to > > > > > > > any > > > > > > > instruction AFAICT, even though they only seem to be > > > > > > > mentioned > > > > > > > explicitly for the MOV instruction... > > > > > > > > > > > > > > > If we want to start doing this with HF conversions > > > > > > > > specifically, I > > > > > > > > guess it is fine, but in case it is not clear, I think > > > > > > > > it > > > > > > > > won't > > > > > > > > bring > > > > > > > > any immediate benefits with the > > > > > > > > VK_KHR_shader_float16_int8 > > > > > > > > implementation since this series only ever emits > > > > > > > > conversions > > > > > > > > involving > > > > > > > > HF operands via MOV instructions, > > > > > > > > > > > > > > Yes, I can see that's the intention of this series, but > > > > > > > how > > > > > > > can > > > > > > > you > > > > > > > make > > > > > > > sure that nothing in the optimizer is breaking your > > > > > > > assumption > > > > > > > if > > > > > > > you > > > > > > > don't add some validator code to verify the claim of your > > > > > > > last > > > > > > > paragraph? > > > > > > > > > > > > > > > which is why I thought that validating that no direct > > > > > > > > MOV > > > > > > > > conversions > > > > > > > > from DF/Q types ever happen was useful, since we have > > > > > > > > code > > > > > > > > in > > > > > > > > the > > > > > > > > driver to handle this scenario. > > > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > > > > > > I still don't understand why you want to implement > > > > > > > > > the > > > > > > > > > same > > > > > > > > > restriction > > > > > > > > > twice, once for MOV and once for all other mixed-mode > > > > > > > > > instructions. How > > > > > > > > > is that more convenient? > > > > > > > > > > > > > > > > The restrictions based on operand types that are > > > > > > > > checked in > > > > > > > > the > > > > > > > > validator are specific to Byte or cases where the > > > > > > > > execution > > > > > > > > type is > > > > > > > > larger than the destination stride, for which mixed > > > > > > > > float > > > > > > > > has > > > > > > > > a > > > > > > > > different set of restrictions. > > > > > > > > > > > > > > > > For example, for mixed float we have this rule: > > > > > > > > > > > > > > > > "In Align1, destination stride can be smaller than > > > > > > > > execution > > > > > > > > type" > > > > > > > > > > > > > > > > Which overrides this other from "General restrictions > > > > > > > > based > > > > > > > > on > > > > > > > > operand > > > > > > > > types": > > > > > > > > > > > > > > > > "Destination stride must be equal to the ratio of the > > > > > > > > sizes > > > > > > > > of > > > > > > > > the > > > > > > > > execution data type to the destination type" > > > > > > > > > > > > > > > > So I thought that it would make things easier to keep > > > > > > > > all > > > > > > > > restrictions > > > > > > > > for mixed float separate and make sure that we skipped > > > > > > > > mixed > > > > > > > > float > > > > > > > > instructions in > > > > > > > > general_restrictions_based_on_operand_types() > > > > > > > > so we > > > > > > > > avoid having to add code to skip individual general > > > > > > > > restrictions > > > > > > > > that that are overriden for mixed float mode anyway. > > > > > > > > > > > > > > > > > > > > > > I'm fine with that, but it doesn't seem like what this > > > > > > > patch > > > > > > > is > > > > > > > doing... > > > > > > > Isn't it adding code to validate mixed-mode float MOV > > > > > > > instructions in > > > > > > > general_restrictions_based_on_operand_types()? > > > > > > > > > > > > I guess this could be arguable, but I was not considering > > > > > > conversion > > > > > > MOVs to be mixed-float instructions. There are two reasons > > > > > > for > > > > > > this: > > > > > > > > > > > > A conversion MOV involving F/HF doesn't seem to be > > > > > > fundamentally > > > > > > different from any other conversion instruction involving > > > > > > other > > > > > > types, > > > > > > other than the requirement of aligning the destination to a > > > > > > Dword, > > > > > > which is not a resriction explictly made for mixed-float > > > > > > mode. > > > > > > > > > > > > Then, for mixed-float mode, there is this other rule: > > > > > > > > > > > > "In Align1, destination stride can be smaller than > > > > > > execution > > > > > > type. > > > > > > When > > > > > > destination is stride of 1, 16 bit packed data > > > > > > is updated on the destination. However, output packed f16 > > > > > > data > > > > > > must > > > > > > be > > > > > > oword aligned, no oword crossing in > > > > > > packed f16." > > > > > > > > > > > > Which contradicts the other rule that conversions from F to > > > > > > HF > > > > > > need > > > > > > to > > > > > > be DWord aligned on the destination. > > > > > > > > > > > > So it seems to me that conversion MOVs are not following > > > > > > the > > > > > > same > > > > > > principles of mixed-float instructions and we should skip > > > > > > validating > > > > > > mixed-float restrictions for them. What do you think? > > > > > > > > > > > > > > > > That all seems fairly ambiguous... And the restriction on > > > > > DWORD > > > > > alignment for conversions includes a mixed-mode ADD > > > > > instruction > > > > > among > > > > > the examples, so I'm skeptical that MOV could be truly > > > > > fundamentally > > > > > different. > > > > > > > > Ok... in that case what do we do about the mixed-float > > > > restriction > > > > I > > > > quoted above? Since it is incompatible with the mixed-float MOV > > > > conversion I guess we only have two options: ether we don't > > > > validate > > > > it > > > > at all or we only validate it for mixed-float instructions that > > > > are > > > > not > > > > MOV. I guess we can do the latter? > > > > > > Also, related to this, if we consider implicit conversions in 2- > > > src > > > instructions to also be a target of the generic rules for > > > conversions > > > to HF described for CHV and SKL+, then doesn't that imply that > > > all > > > mixed-float instructions are conversions and subject to those > > > rules? > > > For example: > > > > > > ADD dst:hf src0:f src1:f > > > ADD dst:hf src0:hf src1:f > > > ADD dst:hf src0:f src1:hf > > > > > > In all 3 instructions above the execution type is F. Should we > > > consider > > > all of them implicit conversions? That would mean that any mixed- > > > float > > > instruction with HF destination is an implicit conversion from F > > > to > > > HF > > > and therefore would be subject to the generic rules for those > > > conversions, which at least in the case of CHV and SKL+ involves > > > DWord > > > alignment on the destination, or in other words: no mixed-float > > > instruction can have packed fp16 destinations. But that seems to > > > contradict what various mixed-float restrictions present in the > > > docs > > > that suggest that packed fp16 destinations are possible with > > > mixed- > > > float mode. Here are some examples: > > > > > > "No SIMD16 in mixed mode when destination is packed f16 for both > > > Align1 > > > and Align16" > > > > > > "In Align1, destination stride can be smaller than execution > > > type. > > > When > > > destination is stride of 1, 16 bit packed data is updated on the > > > destination. However, output packed f16 data must be oword > > > aligned, > > > no > > > oword crossing in packed f16." > > > > > > "When source is float or half float from accumulator register and > > > destination is half float with a stride of 1, the source must > > > register > > > aligned. i.e., source must have offset zero." > > > > > > Since I was only considering that conversion rules only applied > > > to > > > MOV > > > instructions this was not an issue in my series, but if we > > > consider > > > that those rules also apply to implicit conversions from other > > > instructions then it looks like all these restrictions refer to > > > impossible situations, so I am not sure what to do about them... > > > should > > > we just ignore them in the validator? > > > > And I have just noticed another issue with the idea of considering > > direct conversions involving F/HF mixed-float instructions. There > > is > > this other restriction for mixed-float mode: > > > > "No SIMD16 in mixed mode when destination is f32. Instruction > > Execution > > size must be no more than 8." > > > > So if we consider direct HF->F conversions mixed-float instructions > > then we would habe to split all of them to be 8-wide. Obviously, I > > have > > not been doing this and I have never seen any issues on any > > platform > > because of this. > > > > Hm... That's kind of scary... Your code seems to be violating this > and > the other no-SIMD16 restriction of mixed-mode instructions to my best > interpretation of the B-Spec. Unfortunately the simulator doesn't > seem > to enforce either of them so I'm not certain whether it's going to > blow > up in practice or not. Maybe implement the restriction in > get_fpu_lowered_simd_width() to be on the safe side?
Yeah... this is quite unfortunate. I have not seen any problem with this on any platform I have tested, and I guess that the fact that the simulator doesn't complain about them might be another hint that this is actually not a problem but it is annoying to have the docs and the testing give us different info :-(. It is particularly bad in this case, because when you start mixing half-float and float you inevitably get a lot of conversions between the two types along the way (for example, there are a bunch of instructions that do not support half-float so we need to convert to F), so if we decide to split conversions with F destination I imagine it is going to have a visible negative effect on performance, which is particularly bad if it ends up being that there is not such restriction for MOV in practice after all. Anyway, I suppose taking the safe approach could make sense... > > Iago > > > > > > > > > > > > > > > > > Anyway, I'll try to rework the patches to do more > > > > > > > > generic > > > > > > > > validation of > > > > > > > > HF conversions like you ask and see what kind of code > > > > > > > > we > > > > > > > > end > > > > > > > > up > > > > > > > > with. > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > [...] _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev