> > > > +
> > > > +  void SelBasicBlockOptimizer::changeInsideReplaceInfos(const
> > > > + SelectionInstruction& insn, GenRegister& var)  {
> > > > +    for (ReplaceInfo* info : replaceInfos) {
> > > > +      if (info->intermedia.reg() == var.reg()) {
> > A new comment here is that the above loop is a little bit too heavy.
> > For a large BB which has many MOV instructions, it will iterate too many 
> > times for instructions after those MOV instructions. A better way is to 
> > change to use a new map type of replaceInfos:
> > map <ir::Register, set<ReplaceInfo*>>
> > 
> > This will be much faster than iterate all infos for each instruction.
> > 
> > [Yejun] nice, I'll change to map.
> > 


Look into the code again, and found the following check is interesting and not 
simple at all :)

+        if (insn.opcode != SEL_OP_BSWAP && !insn.isWrite()) {

1. Why BSWAP is special here? The root cause may be that the src of bswap will 
be modified.
   But should this be handled in the BSWAP instruction already (I mean, both 
src and
   dst of BSWAP should be put into  src array and dst array ).
   Unfortunately, BSWAP seems have a bug here and may lead to incorrect 
scheduling latter.
   That should be fixed in BSWAP, then BSWAP check should be removed from here.

[yejun] agree. Actually, there are two methods here in general, the first is 
looking into detail and fix the prerequisite  issue first, the second is to 
skip the prerequisite issue (bswap/iswrite) and finish the main purpose (local 
copy propagation base) first, the issue will be reviewed again when we do 
optimization for a specific kernel. I choose the second one as my direction for 
this patch set.

2. Why !insn.isWrite() is here?
   This is more difficult than the first one. My understanding is that a 
register in selection
   vector should not be replaced. Because, that will bring extra complexity to 
manipulate the
   selection vectors. But the isWrite() checking doesn't cover all the cases. 
Actually,
   nearly all read instructions have one selection source vector, although many 
of them have only
   1 element. For example, the SAMPLE instruction has a source selection vector.

[yejun] I agree !insn.isWrite() is not enough. Actually, I make a very loose 
condition here according to the result of utest. I'm expecting there might be 
more conditions, there might be regression issues with the current condition. 
I'm unable to find all the conditions now, so want the regression issue to 
educate me.

[Yejun] Btw, is there a way to check if it is 'a register in selection vector', 
instead of the opcode check? Thanks.



> > [Yejun] Let's use the following SEL IR as an example, %42 is the same 
> > register but with two different stride in the two IRs. 
> > MOV(1)              %42<2>:D        :       %41<0,1,0>:D
> > MOV(16)           %43<1>:D          :       %42<0,1,0>:D
> This doesn't address my comment. As the comment suggests to compare the 
> GenRegisters' attribute not the virtual register. You can easily found the 
> above two GenRegisters are different.
> 
> [Yejun] My expectation is that %42 should be considered as the same, and to 
> optimize the IR as "mov(16) %43<1>:D, %41<0,1,0>:D". The optimization chance 
> is lost if we compare the width/stride of %42 and fount they are not the same.

The clear logic here is that the assignment at intermedia instruction should 
cover the current replacement's usage. So the clear way to check that is to 
implement the following function:

bool IsSrcCoveredByIntermedia(GenRegister src, int curExecWidth, GenRegister 
intermedia, int intermediaExecWidth) { }

Before we do the above check, another thing we need to take into consideration 
which is the predication and mask of the current instruction and the intermedia 
instruction.

intermedia                src                       replacable
1)noMask = 1               all                        yes
2)GEN_PREDICATE_NORMAL     GEN_PREDICATE_NORMAL       yes
3)GEN_PREDICATE_NORMAL     !GEN_PREDICATE_NORMAL      yes
4)!GEN_PREDICATE_NORMAL    all                        no (could check pred flag 
further, but not worth to do)
[Yejun] thanks, these info are what I expected regression issues will educate 
me. Btw, is there a typo for item 2) and 3), looks that they can be merged into 
one case, while you are listing them as two cases.


So we may change the above function to:
bool IsSrcCoveredByIntermedia(GenRegister src, SelectionInstruction *curInsn, 
GenRegister intermedia, SelectionInstruction *intermediaInsn) { }

Do all the check in this function and give a result whether it's replacable.

[Yejun] agree to encapsulate all the logic into a new function such as 
CanBeReplaced(). I do not use 'cover' here, for the case that src is part of 
intermedia, extra logic is needed for the replacement. It can be added as an 
extended feature in the real optimization work, if necessary.

BTW, as a reviewer, I really love more clear comment which could reduce tons of 
time to get to the point :).

[Yejun] and thanks for your time reviewing and discussing, it really helps the 
code improvement.

Thanks,
Zhigang Gong.

> 
> > 
> > 
> > Thanks,
> > Zhigang Gong.
> > 
> > > 
> > > 
> > > 
> > > 
> > > The other parts LGTM,
> > > 
> > > Thanks,
> > > Zhigang Gong
_______________________________________________
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet

Reply via email to