Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
Iago Toralwrites: > On Tue, 2016-05-10 at 16:53 -0700, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >> > From: Iago Toral Quiroga >> > >> > We were not accounting for reg_suboffset in the check for the start >> > of the region. This meant that would allow copy-propagation even if >> > the dst wrote to sub_regoffset 4 and our source read from >> > sub_regoffset 0, which is not correct. This was observed in fp64 code, >> > since there we use reg_suboffset to select the high 32-bit of a double. >> > >> I don't think this paragraph is accurate, copy instructions with >> non-zero destination subreg offset are currently considered partial >> writes and should never have been added to the ACP hash table in the >> first place. > > Right, I think I wrote this patch before the one where I fixed > is_partial_write() to consider any write to subreg_offset > 0 partial. > >> > Also, fs_reg::regs_read() already takes the stride into account, so we >> > should not multiply its result by the stride again. This was making >> > copy-propagation fail to copy-propagate cases that would otherwise be >> > safe to copy-propagate. Again, this was observed in fp64 code, since >> > there we use stride > 1 often. >> > >> > Incidentally, these fixes open up more possibilities for copy propagation >> > which uncovered new bugs in copy-propagation. The folowing patches address >> > each of these new issues. >> >> Oh man, that sucks... >> >> > --- >> > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 >> > + >> > 1 file changed, 13 insertions(+), 8 deletions(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > index 5fae10f..23df877 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned >> > stride, >> > return true; >> > } >> > >> > +static inline bool >> > +region_match(fs_reg src, unsigned regs_read, >> > + fs_reg dst, unsigned regs_written) >> >> How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I >> personally wouldn't mind 'region_match' but >> 'write_region_contains_read_region' sounds a bit too long for my taste). >> >> > +{ >> > + return src.reg_offset >= dst.reg_offset && >> > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) >> > && >> > + src.subreg_offset >= dst.subreg_offset; >> >> This works under the assumption that src.subreg_offset is strictly less >> than the reg_offset unit -- Which *should* be the case unless we've >> messed up that restriction in some place (we have in the past :P). To >> be on the safe side you could do something like following, if you like: >> >> | return (src.reg_offset * REG_SIZE + src.subreg_offset >= >> | dst.reg_offset * REG_SIZE + dst.subreg_offset) && >> | src.reg_offset + regs_read <= dst.reg_offset + regs_written; > > I understand that even if we discard writes with dst.subreg_offset > 0, > you still want the subreg_offset check here to be safe exactly in that > scenario (since we would not need this for the case I originally wrote > it for). > Yeah, it's up to you but I guess it wouldn't hurt to be extra-paranoid here, and it would probably be sensible to add 'src.file == dst.file && src.nr == dst.nr && ...' to the return expresssion in addition. >> With the above taken into account: >> >> Reviewed-by: Francisco Jerez > > Thanks! > >> > +} >> > + >> > bool >> > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) >> > { >> > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int >> > arg, acp_entry *entry) >> > /* Bail if inst is reading a range that isn't contained in the range >> > * that entry is writing. >> > */ >> > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || >> > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + >> > -inst->regs_read(arg) * inst->src[arg].stride * 32) > >> > - (entry->dst.reg_offset + entry->regs_written) * 32) >> > + if (!region_match(inst->src[arg], inst->regs_read(arg), >> > + entry->dst, entry->regs_written)) >> >return false; >> > >> > /* we can't generally copy-propagate UD negations because we >> > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, >> > acp_entry *entry) >> >/* Bail if inst is reading a range that isn't contained in the range >> > * that entry is writing. >> > */ >> > - if (inst->src[i].reg_offset < entry->dst.reg_offset || >> > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + >> > - inst->regs_read(i) * inst->src[i].stride *
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
Iago Toralwrites: > On Tue, 2016-05-10 at 16:53 -0700, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >> > From: Iago Toral Quiroga >> > >> > We were not accounting for reg_suboffset in the check for the start >> > of the region. This meant that would allow copy-propagation even if >> > the dst wrote to sub_regoffset 4 and our source read from >> > sub_regoffset 0, which is not correct. This was observed in fp64 code, >> > since there we use reg_suboffset to select the high 32-bit of a double. >> > >> I don't think this paragraph is accurate, copy instructions with >> non-zero destination subreg offset are currently considered partial >> writes and should never have been added to the ACP hash table in the >> first place. > > Right, I think I wrote this patch before the one where I fixed > is_partial_write() to consider any write to subreg_offset > 0 partial. > In practice they would be considered partial writes already, but the reason is somewhat obscure -- Writes with subreg_offset != 0 would necessarily fall into three categories: - Strided writes (which are considered partial explicitly). - Contiguous writes which write less than one GRF worth of data (which are considered partial explicitly) - Contiguous writes which straddle two registers (which have been severely limited by the hardware historically). Even though it's unlikely to have fixed pre-existing bugs, for the sake of sanity I'm glad that you fixed is_partial_write() to handle the latter case explicitly (thanks!), especially since the hardware has slowly been lifting restrictions on the ways you can straddle multiple registers: On Gen7 the stupid decompression behaviour that simply shifts the second decompressed portion of the instruction by one GRF would kill you. Gen8 is slightly less stupid and shifts the second decompressed portion by ExecSize/2 components which allows it to handle a subset of straddled writes (only the cases where the written components are balanced between the two registers). AFAIK Gen9 should be the first to support fully unrestricted unbalanced writes. >> > Also, fs_reg::regs_read() already takes the stride into account, so we >> > should not multiply its result by the stride again. This was making >> > copy-propagation fail to copy-propagate cases that would otherwise be >> > safe to copy-propagate. Again, this was observed in fp64 code, since >> > there we use stride > 1 often. >> > >> > Incidentally, these fixes open up more possibilities for copy propagation >> > which uncovered new bugs in copy-propagation. The folowing patches address >> > each of these new issues. >> >> Oh man, that sucks... >> >> > --- >> > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 >> > + >> > 1 file changed, 13 insertions(+), 8 deletions(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > index 5fae10f..23df877 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned >> > stride, >> > return true; >> > } >> > >> > +static inline bool >> > +region_match(fs_reg src, unsigned regs_read, >> > + fs_reg dst, unsigned regs_written) >> >> How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I >> personally wouldn't mind 'region_match' but >> 'write_region_contains_read_region' sounds a bit too long for my taste). >> >> > +{ >> > + return src.reg_offset >= dst.reg_offset && >> > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) >> > && >> > + src.subreg_offset >= dst.subreg_offset; >> >> This works under the assumption that src.subreg_offset is strictly less >> than the reg_offset unit -- Which *should* be the case unless we've >> messed up that restriction in some place (we have in the past :P). To >> be on the safe side you could do something like following, if you like: >> >> | return (src.reg_offset * REG_SIZE + src.subreg_offset >= >> | dst.reg_offset * REG_SIZE + dst.subreg_offset) && >> | src.reg_offset + regs_read <= dst.reg_offset + regs_written; > > I understand that even if we discard writes with dst.subreg_offset > 0, > you still want the subreg_offset check here to be safe exactly in that > scenario (since we would not need this for the case I originally wrote > it for). > >> With the above taken into account: >> >> Reviewed-by: Francisco Jerez > > Thanks! > >> > +} >> > + >> > bool >> > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) >> > { >> > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int >> > arg, acp_entry *entry) >> > /* Bail if inst is reading a range that isn't contained
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
On Tue, 2016-05-10 at 16:53 -0700, Francisco Jerez wrote: > Samuel Iglesias Gonsálvezwrites: > > > From: Iago Toral Quiroga > > > > We were not accounting for reg_suboffset in the check for the start > > of the region. This meant that would allow copy-propagation even if > > the dst wrote to sub_regoffset 4 and our source read from > > sub_regoffset 0, which is not correct. This was observed in fp64 code, > > since there we use reg_suboffset to select the high 32-bit of a double. > > > I don't think this paragraph is accurate, copy instructions with > non-zero destination subreg offset are currently considered partial > writes and should never have been added to the ACP hash table in the > first place. Right, I think I wrote this patch before the one where I fixed is_partial_write() to consider any write to subreg_offset > 0 partial. > > Also, fs_reg::regs_read() already takes the stride into account, so we > > should not multiply its result by the stride again. This was making > > copy-propagation fail to copy-propagate cases that would otherwise be > > safe to copy-propagate. Again, this was observed in fp64 code, since > > there we use stride > 1 often. > > > > Incidentally, these fixes open up more possibilities for copy propagation > > which uncovered new bugs in copy-propagation. The folowing patches address > > each of these new issues. > > Oh man, that sucks... > > > --- > > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 > > + > > 1 file changed, 13 insertions(+), 8 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > index 5fae10f..23df877 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned > > stride, > > return true; > > } > > > > +static inline bool > > +region_match(fs_reg src, unsigned regs_read, > > + fs_reg dst, unsigned regs_written) > > How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I > personally wouldn't mind 'region_match' but > 'write_region_contains_read_region' sounds a bit too long for my taste). > > > +{ > > + return src.reg_offset >= dst.reg_offset && > > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) > > && > > + src.subreg_offset >= dst.subreg_offset; > > This works under the assumption that src.subreg_offset is strictly less > than the reg_offset unit -- Which *should* be the case unless we've > messed up that restriction in some place (we have in the past :P). To > be on the safe side you could do something like following, if you like: > > | return (src.reg_offset * REG_SIZE + src.subreg_offset >= > | dst.reg_offset * REG_SIZE + dst.subreg_offset) && > | src.reg_offset + regs_read <= dst.reg_offset + regs_written; I understand that even if we discard writes with dst.subreg_offset > 0, you still want the subreg_offset check here to be safe exactly in that scenario (since we would not need this for the case I originally wrote it for). > With the above taken into account: > > Reviewed-by: Francisco Jerez Thanks! > > +} > > + > > bool > > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) > > { > > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > > acp_entry *entry) > > /* Bail if inst is reading a range that isn't contained in the range > > * that entry is writing. > > */ > > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + > > -inst->regs_read(arg) * inst->src[arg].stride * 32) > > > - (entry->dst.reg_offset + entry->regs_written) * 32) > > + if (!region_match(inst->src[arg], inst->regs_read(arg), > > + entry->dst, entry->regs_written)) > >return false; > > > > /* we can't generally copy-propagate UD negations because we > > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, > > acp_entry *entry) > >/* Bail if inst is reading a range that isn't contained in the range > > * that entry is writing. > > */ > > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + > > - inst->regs_read(i) * inst->src[i].stride * 32) > > > - (entry->dst.reg_offset + entry->regs_written) * 32) > > + if (!region_match(inst->src[i], inst->regs_read(i), > > +entry->dst, entry->regs_written)) > > continue; > > > >fs_reg val = entry->src; > > -- > > 2.5.0 > > > > ___ > > mesa-dev mailing list
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
Francisco Jerezwrites: > Samuel Iglesias Gonsálvez writes: > >> From: Iago Toral Quiroga >> >> We were not accounting for reg_suboffset in the check for the start >> of the region. This meant that would allow copy-propagation even if >> the dst wrote to sub_regoffset 4 and our source read from >> sub_regoffset 0, which is not correct. This was observed in fp64 code, >> since there we use reg_suboffset to select the high 32-bit of a double. >> > I don't think this paragraph is accurate, copy instructions with > non-zero destination subreg offset are currently considered partial > writes and should never have been added to the ACP hash table in the > first place. > >> Also, fs_reg::regs_read() already takes the stride into account, so we >> should not multiply its result by the stride again. This was making >> copy-propagation fail to copy-propagate cases that would otherwise be >> safe to copy-propagate. Again, this was observed in fp64 code, since >> there we use stride > 1 often. >> >> Incidentally, these fixes open up more possibilities for copy propagation >> which uncovered new bugs in copy-propagation. The folowing patches address >> each of these new issues. > > Oh man, that sucks... > >> --- >> .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 >> + >> 1 file changed, 13 insertions(+), 8 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> index 5fae10f..23df877 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned >> stride, >> return true; >> } >> >> +static inline bool >> +region_match(fs_reg src, unsigned regs_read, >> + fs_reg dst, unsigned regs_written) > Forgot to mention that there's no reason to pass the registers by value here, please use 'const fs_reg &' instead. > How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I Oops, in case it wasn't not clear from my sentence above, I didn't intend to suggest using different argument names for this function, I just typoed them -- regs_written sounds fine to me. > personally wouldn't mind 'region_match' but > 'write_region_contains_read_region' sounds a bit too long for my taste). > >> +{ >> + return src.reg_offset >= dst.reg_offset && >> + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) && >> + src.subreg_offset >= dst.subreg_offset; > > This works under the assumption that src.subreg_offset is strictly less > than the reg_offset unit -- Which *should* be the case unless we've > messed up that restriction in some place (we have in the past :P). To > be on the safe side you could do something like following, if you like: > > | return (src.reg_offset * REG_SIZE + src.subreg_offset >= > | dst.reg_offset * REG_SIZE + dst.subreg_offset) && > | src.reg_offset + regs_read <= dst.reg_offset + regs_written; > > With the above taken into account: > > Reviewed-by: Francisco Jerez > >> +} >> + >> bool >> fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) >> { >> @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, >> acp_entry *entry) >> /* Bail if inst is reading a range that isn't contained in the range >> * that entry is writing. >> */ >> - if (inst->src[arg].reg_offset < entry->dst.reg_offset || >> - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + >> -inst->regs_read(arg) * inst->src[arg].stride * 32) > >> - (entry->dst.reg_offset + entry->regs_written) * 32) >> + if (!region_match(inst->src[arg], inst->regs_read(arg), >> + entry->dst, entry->regs_written)) >>return false; >> >> /* we can't generally copy-propagate UD negations because we >> @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, >> acp_entry *entry) >>/* Bail if inst is reading a range that isn't contained in the range >> * that entry is writing. >> */ >> - if (inst->src[i].reg_offset < entry->dst.reg_offset || >> - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + >> - inst->regs_read(i) * inst->src[i].stride * 32) > >> - (entry->dst.reg_offset + entry->regs_written) * 32) >> + if (!region_match(inst->src[i], inst->regs_read(i), >> +entry->dst, entry->regs_written)) >> continue; >> >>fs_reg val = entry->src; >> -- >> 2.5.0 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
Samuel Iglesias Gonsálvezwrites: > From: Iago Toral Quiroga > > We were not accounting for reg_suboffset in the check for the start > of the region. This meant that would allow copy-propagation even if > the dst wrote to sub_regoffset 4 and our source read from > sub_regoffset 0, which is not correct. This was observed in fp64 code, > since there we use reg_suboffset to select the high 32-bit of a double. > I don't think this paragraph is accurate, copy instructions with non-zero destination subreg offset are currently considered partial writes and should never have been added to the ACP hash table in the first place. > Also, fs_reg::regs_read() already takes the stride into account, so we > should not multiply its result by the stride again. This was making > copy-propagation fail to copy-propagate cases that would otherwise be > safe to copy-propagate. Again, this was observed in fp64 code, since > there we use stride > 1 often. > > Incidentally, these fixes open up more possibilities for copy propagation > which uncovered new bugs in copy-propagation. The folowing patches address > each of these new issues. Oh man, that sucks... > --- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 > + > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > index 5fae10f..23df877 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned > stride, > return true; > } > > +static inline bool > +region_match(fs_reg src, unsigned regs_read, > + fs_reg dst, unsigned regs_written) How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I personally wouldn't mind 'region_match' but 'write_region_contains_read_region' sounds a bit too long for my taste). > +{ > + return src.reg_offset >= dst.reg_offset && > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) && > + src.subreg_offset >= dst.subreg_offset; This works under the assumption that src.subreg_offset is strictly less than the reg_offset unit -- Which *should* be the case unless we've messed up that restriction in some place (we have in the past :P). To be on the safe side you could do something like following, if you like: | return (src.reg_offset * REG_SIZE + src.subreg_offset >= | dst.reg_offset * REG_SIZE + dst.subreg_offset) && | src.reg_offset + regs_read <= dst.reg_offset + regs_written; With the above taken into account: Reviewed-by: Francisco Jerez > +} > + > bool > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) > { > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > acp_entry *entry) > /* Bail if inst is reading a range that isn't contained in the range > * that entry is writing. > */ > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + > -inst->regs_read(arg) * inst->src[arg].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[arg], inst->regs_read(arg), > + entry->dst, entry->regs_written)) >return false; > > /* we can't generally copy-propagate UD negations because we > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, > acp_entry *entry) >/* Bail if inst is reading a range that isn't contained in the range > * that entry is writing. > */ > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + > - inst->regs_read(i) * inst->src[i].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[i], inst->regs_read(i), > +entry->dst, entry->regs_written)) > continue; > >fs_reg val = entry->src; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
On Tue, 2016-05-03 at 17:28 -0700, Jordan Justen wrote: > On 2016-05-03 05:21:55, Samuel Iglesias Gonsálvez wrote: > > From: Iago Toral Quiroga> > > > We were not accounting for reg_suboffset in the check for the start > > of the region. This meant that would allow copy-propagation even if > > the dst wrote to sub_regoffset 4 and our source read from > > sub_regoffset 0, which is not correct. This was observed in fp64 code, > > since there we use reg_suboffset to select the high 32-bit of a double. > > > > Also, fs_reg::regs_read() already takes the stride into account, so we > > should not multiply its result by the stride again. This was making > > copy-propagation fail to copy-propagate cases that would otherwise be > > safe to copy-propagate. Again, this was observed in fp64 code, since > > there we use stride > 1 often. > > > > Incidentally, these fixes open up more possibilities for copy propagation > > which uncovered new bugs in copy-propagation. The folowing patches address > > each of these new issues. > > Should this be moved after those fixes? Yeah, I think that would be better. > -Jordan > > > --- > > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 > > + > > 1 file changed, 13 insertions(+), 8 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > index 5fae10f..23df877 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned > > stride, > > return true; > > } > > > > +static inline bool > > +region_match(fs_reg src, unsigned regs_read, > > + fs_reg dst, unsigned regs_written) > > +{ > > + return src.reg_offset >= dst.reg_offset && > > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) > > && > > + src.subreg_offset >= dst.subreg_offset; > > +} > > + > > bool > > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) > > { > > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > > acp_entry *entry) > > /* Bail if inst is reading a range that isn't contained in the range > > * that entry is writing. > > */ > > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + > > -inst->regs_read(arg) * inst->src[arg].stride * 32) > > > - (entry->dst.reg_offset + entry->regs_written) * 32) > > + if (!region_match(inst->src[arg], inst->regs_read(arg), > > + entry->dst, entry->regs_written)) > >return false; > > > > /* we can't generally copy-propagate UD negations because we > > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, > > acp_entry *entry) > >/* Bail if inst is reading a range that isn't contained in the range > > * that entry is writing. > > */ > > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + > > - inst->regs_read(i) * inst->src[i].stride * 32) > > > - (entry->dst.reg_offset + entry->regs_written) * 32) > > + if (!region_match(inst->src[i], inst->regs_read(i), > > +entry->dst, entry->regs_written)) > > continue; > > > >fs_reg val = entry->src; > > -- > > 2.5.0 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
On Tue, 2016-05-03 at 17:37 -0700, Jason Ekstrand wrote: > > > On Tue, May 3, 2016 at 5:21 AM, Samuel Iglesias Gonsálvez >wrote: > From: Iago Toral Quiroga > > We were not accounting for reg_suboffset in the check for the > start > of the region. This meant that would allow copy-propagation > even if > the dst wrote to sub_regoffset 4 and our source read from > sub_regoffset 0, which is not correct. This was observed in > fp64 code, > since there we use reg_suboffset to select the high 32-bit of > a double. > > > I'm not sure if copy-prop can handle the case where you write to > sub_regoffset != 0 anyway. Maybe we should just bail on that? Right, we found a few issues and some of the patches in this series address the problems we found. Maybe there are still a few other cases that we missed but my thinking was that we probably wanted to fix the pass for subreg_offset != 0 anyway and if we actually missed something we could just fix it up when it shows up as a bug somewhere. That said, we can of course simplify things and just bail here, we would lose some optimizations but we might not need some of the other patches in the series... > Also, fs_reg::regs_read() already takes the stride into > account, so we > should not multiply its result by the stride again. This was > making > copy-propagation fail to copy-propagate cases that would > otherwise be > safe to copy-propagate. Again, this was observed in fp64 code, > since > there we use stride > 1 often. > > Incidentally, these fixes open up more possibilities for copy > propagation > which uncovered new bugs in copy-propagation. The folowing > patches address > each of these new issues. > --- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 > + > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git > a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > index 5fae10f..23df877 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned > arg, unsigned stride, > return true; > } > > +static inline bool > +region_match(fs_reg src, unsigned regs_read, > + fs_reg dst, unsigned regs_written) > > > I don't like the name. It should say something about containment with > perhaps a comment to go along with it. Should we assert that they > have the same file and number? /** * Check that the register region read by src [src.reg_offset, * src.reg_offset + regs_read] is contained inside the register * region written by dst [dst.reg_offset, dst.reg_offset + regs_written] * Both src and dst must have the same register number and file. */ static inline bool write_region_contains_read_region(...) And yes, we should definitely assert on the file and register number. > +{ > + return src.reg_offset >= dst.reg_offset && > + (src.reg_offset + regs_read) <= (dst.reg_offset + > regs_written) && > + src.subreg_offset >= dst.subreg_offset; > +} > + > bool > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > acp_entry *entry) > { > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst > *inst, int arg, acp_entry *entry) > /* Bail if inst is reading a range that isn't contained in > the range > * that entry is writing. > */ > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > - (inst->src[arg].reg_offset * 32 + > inst->src[arg].subreg_offset + > -inst->regs_read(arg) * inst->src[arg].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[arg], inst->regs_read(arg), > + entry->dst, entry->regs_written)) >return false; > > /* we can't generally copy-propagate UD negations because > we > @@ -554,10 +561,8 @@ > fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry > *entry) >/* Bail if inst is reading a range that isn't contained > in the range > * that entry is writing. > */ > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > - (inst->src[i].reg_offset * 32 + > inst->src[i].subreg_offset + > -
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
On Tue, May 3, 2016 at 5:21 AM, Samuel Iglesias Gonsálvez < sigles...@igalia.com> wrote: > From: Iago Toral Quiroga> > We were not accounting for reg_suboffset in the check for the start > of the region. This meant that would allow copy-propagation even if > the dst wrote to sub_regoffset 4 and our source read from > sub_regoffset 0, which is not correct. This was observed in fp64 code, > since there we use reg_suboffset to select the high 32-bit of a double. > I'm not sure if copy-prop can handle the case where you write to sub_regoffset != 0 anyway. Maybe we should just bail on that? > Also, fs_reg::regs_read() already takes the stride into account, so we > should not multiply its result by the stride again. This was making > copy-propagation fail to copy-propagate cases that would otherwise be > safe to copy-propagate. Again, this was observed in fp64 code, since > there we use stride > 1 often. > > Incidentally, these fixes open up more possibilities for copy propagation > which uncovered new bugs in copy-propagation. The folowing patches address > each of these new issues. > --- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 > + > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > index 5fae10f..23df877 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned > stride, > return true; > } > > +static inline bool > +region_match(fs_reg src, unsigned regs_read, > + fs_reg dst, unsigned regs_written) > I don't like the name. It should say something about containment with perhaps a comment to go along with it. Should we assert that they have the same file and number? > +{ > + return src.reg_offset >= dst.reg_offset && > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) > && > + src.subreg_offset >= dst.subreg_offset; > +} > + > bool > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) > { > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int > arg, acp_entry *entry) > /* Bail if inst is reading a range that isn't contained in the range > * that entry is writing. > */ > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + > -inst->regs_read(arg) * inst->src[arg].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[arg], inst->regs_read(arg), > + entry->dst, entry->regs_written)) >return false; > > /* we can't generally copy-propagate UD negations because we > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, > acp_entry *entry) >/* Bail if inst is reading a range that isn't contained in the range > * that entry is writing. > */ > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + > - inst->regs_read(i) * inst->src[i].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[i], inst->regs_read(i), > +entry->dst, entry->regs_written)) > continue; > >fs_reg val = entry->src; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
On 2016-05-03 05:21:55, Samuel Iglesias Gonsálvez wrote: > From: Iago Toral Quiroga> > We were not accounting for reg_suboffset in the check for the start > of the region. This meant that would allow copy-propagation even if > the dst wrote to sub_regoffset 4 and our source read from > sub_regoffset 0, which is not correct. This was observed in fp64 code, > since there we use reg_suboffset to select the high 32-bit of a double. > > Also, fs_reg::regs_read() already takes the stride into account, so we > should not multiply its result by the stride again. This was making > copy-propagation fail to copy-propagate cases that would otherwise be > safe to copy-propagate. Again, this was observed in fp64 code, since > there we use stride > 1 often. > > Incidentally, these fixes open up more possibilities for copy propagation > which uncovered new bugs in copy-propagation. The folowing patches address > each of these new issues. Should this be moved after those fixes? -Jordan > --- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 > + > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > index 5fae10f..23df877 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned > stride, > return true; > } > > +static inline bool > +region_match(fs_reg src, unsigned regs_read, > + fs_reg dst, unsigned regs_written) > +{ > + return src.reg_offset >= dst.reg_offset && > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) && > + src.subreg_offset >= dst.subreg_offset; > +} > + > bool > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) > { > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > acp_entry *entry) > /* Bail if inst is reading a range that isn't contained in the range > * that entry is writing. > */ > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + > -inst->regs_read(arg) * inst->src[arg].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[arg], inst->regs_read(arg), > + entry->dst, entry->regs_written)) >return false; > > /* we can't generally copy-propagate UD negations because we > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, > acp_entry *entry) >/* Bail if inst is reading a range that isn't contained in the range > * that entry is writing. > */ > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + > - inst->regs_read(i) * inst->src[i].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[i], inst->regs_read(i), > +entry->dst, entry->regs_written)) > continue; > >fs_reg val = entry->src; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
From: Iago Toral QuirogaWe were not accounting for reg_suboffset in the check for the start of the region. This meant that would allow copy-propagation even if the dst wrote to sub_regoffset 4 and our source read from sub_regoffset 0, which is not correct. This was observed in fp64 code, since there we use reg_suboffset to select the high 32-bit of a double. Also, fs_reg::regs_read() already takes the stride into account, so we should not multiply its result by the stride again. This was making copy-propagation fail to copy-propagate cases that would otherwise be safe to copy-propagate. Again, this was observed in fp64 code, since there we use stride > 1 often. Incidentally, these fixes open up more possibilities for copy propagation which uncovered new bugs in copy-propagation. The folowing patches address each of these new issues. --- .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 + 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index 5fae10f..23df877 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned stride, return true; } +static inline bool +region_match(fs_reg src, unsigned regs_read, + fs_reg dst, unsigned regs_written) +{ + return src.reg_offset >= dst.reg_offset && + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) && + src.subreg_offset >= dst.subreg_offset; +} + bool fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) { @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) /* Bail if inst is reading a range that isn't contained in the range * that entry is writing. */ - if (inst->src[arg].reg_offset < entry->dst.reg_offset || - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + -inst->regs_read(arg) * inst->src[arg].stride * 32) > - (entry->dst.reg_offset + entry->regs_written) * 32) + if (!region_match(inst->src[arg], inst->regs_read(arg), + entry->dst, entry->regs_written)) return false; /* we can't generally copy-propagate UD negations because we @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry) /* Bail if inst is reading a range that isn't contained in the range * that entry is writing. */ - if (inst->src[i].reg_offset < entry->dst.reg_offset || - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + - inst->regs_read(i) * inst->src[i].stride * 32) > - (entry->dst.reg_offset + entry->regs_written) * 32) + if (!region_match(inst->src[i], inst->regs_read(i), +entry->dst, entry->regs_written)) continue; fs_reg val = entry->src; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev