Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-07-04 Thread Wang, Rander
There are many differences between D and UD.
It would be ugly to handle them in a single if branch.

-Original Message-
From: Yang, Rong R 
Sent: Tuesday, July 4, 2017 7:42 AM
To: Wang, Rander <rander.w...@intel.com>; inte...@intelfx.name; 
beignet@lists.freedesktop.org
Cc: Song, Ruiling <ruiling.s...@intel.com>
Subject: RE: [Beignet] [PATCH V4] backend: add global immediate optimization

GEN supports mixed-type instructions, e.g. mixing UD and UW as in UD * UW.
How about handling the U/UD in one if branch?

> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf 
> Of Wang, Rander
> Sent: Monday, July 3, 2017 9:33
> To: inte...@intelfx.name; beignet@lists.freedesktop.org
> Cc: Song, Ruiling <ruiling.s...@intel.com>
> Subject: Re: [Beignet] [PATCH V4] backend: add global immediate 
> optimization
> 
> For D + UD, D is treated as UD by the HW.
> 
> -Original Message-
> From: Ivan Shapovalov [mailto:inte...@intelfx.name]
> Sent: Saturday, July 1, 2017 2:26 AM
> To: Wang, Rander <rander.w...@intel.com>; 
> beignet@lists.freedesktop.org
> Cc: Song, Ruiling <ruiling.s...@intel.com>
> Subject: Re: [Beignet] [PATCH V4] backend: add global immediate 
> optimization
> 
> On 2017-06-30 at 15:36 +0300, Ivan Shapovalov wrote:
> > On 2017-06-30 at 01:46 +, Wang, Rander wrote:
> > > Hi,
> > >
> > >   The abs of UD has to be applied if it is encoded in the instruction,
> > > whether or not it makes sense.
> > > I have discussed this with my colleague and refined it.
> > > First we inspected the HW behavior of ABS(UD) and -(UD) and found that
> > > ABS(UD) = UD,
> > > -(UD) = the result of -(UD) on the CPU.
> > >
> > >   So the abs calculation can be removed, and this will make it compile.
> > >
> > > Rander
> >
> > Hi,
> >
> > OK, but what about reading from .value.ud if the corresponding .type 
> > is not GEN_TYPE_UD? Is this a concern? Which operand type 
> > combinations are possible?
> >
> 
> I mean, due to an || in the conditional it looks like it is possible 
> for either of the operands to not be a GEN_TYPE_D. Suppose the first 
> operand is a signed dword (GEN_TYPE_D) that holds a negative value and 
> has the ABS flag. In this case the new code will yield a significantly wrong 
> result. Is this possible?
> 
> --
> Ivan Shapovalov / intelfx /
> ___
> Beignet mailing list
> Beignet@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-07-03 Thread Yang, Rong R
GEN supports mixed-type instructions, e.g. mixing UD and UW as in UD * UW.
How about handling the U/UD in one if branch?

> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> Wang, Rander
> Sent: Monday, July 3, 2017 9:33
> To: inte...@intelfx.name; beignet@lists.freedesktop.org
> Cc: Song, Ruiling <ruiling.s...@intel.com>
> Subject: Re: [Beignet] [PATCH V4] backend: add global immediate
> optimization
> 
> For D + UD, D is treated as UD by the HW.
> 
> -Original Message-
> From: Ivan Shapovalov [mailto:inte...@intelfx.name]
> Sent: Saturday, July 1, 2017 2:26 AM
> To: Wang, Rander <rander.w...@intel.com>; beignet@lists.freedesktop.org
> Cc: Song, Ruiling <ruiling.s...@intel.com>
> Subject: Re: [Beignet] [PATCH V4] backend: add global immediate
> optimization
> 
> On 2017-06-30 at 15:36 +0300, Ivan Shapovalov wrote:
> > On 2017-06-30 at 01:46 +, Wang, Rander wrote:
> > > Hi,
> > >
> > >   The abs of UD has to be applied if it is encoded in the instruction,
> > > whether or not it makes sense.
> > > I have discussed this with my colleague and refined it.
> > > First we inspected the HW behavior of ABS(UD) and -(UD) and found that
> > > ABS(UD) = UD,
> > > -(UD) = the result of -(UD) on the CPU.
> > >
> > >   So the abs calculation can be removed, and this will make it compile.
> > >
> > > Rander
> >
> > Hi,
> >
> > OK, but what about reading from .value.ud if the corresponding .type
> > is not GEN_TYPE_UD? Is this a concern? Which operand type combinations
> > are possible?
> >
> 
> I mean, due to an || in the conditional it looks like it is possible for
> either of the operands to not be a GEN_TYPE_D. Suppose the first operand is
> a signed dword (GEN_TYPE_D) that holds a negative value and has the ABS
> flag. In this case the new code will yield a significantly wrong result. Is
> this possible?
> 
> --
> Ivan Shapovalov / intelfx /


Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-07-02 Thread Wang, Rander
For D + UD, D is treated as UD by the HW.
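
A small C++ sketch of why treating D as UD is harmless for addition: in two's complement, a 32-bit add yields the same bit pattern whether the operands are read as signed or unsigned. The helper names `asUD` and `addAsUD` are hypothetical and are not part of Beignet.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a signed dword (D) as an unsigned dword (UD).
uint32_t asUD(int32_t d) {
  uint32_t ud;
  std::memcpy(&ud, &d, sizeof ud);  // bit-for-bit copy, no value conversion
  return ud;
}

// Add a D operand and a UD operand with the D operand treated as UD.
uint32_t addAsUD(int32_t d, uint32_t ud) {
  return asUD(d) + ud;  // unsigned arithmetic wraps modulo 2^32
}
```

For example, `addAsUD(-1, 0x30)` gives `0x2F`, the same bits a signed add of -1 and 48 would produce. This only covers plain addition; source modifiers such as ABS are a separate question.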

-Original Message-
From: Ivan Shapovalov [mailto:inte...@intelfx.name] 
Sent: Saturday, July 1, 2017 2:26 AM
To: Wang, Rander <rander.w...@intel.com>; beignet@lists.freedesktop.org
Cc: Song, Ruiling <ruiling.s...@intel.com>
Subject: Re: [Beignet] [PATCH V4] backend: add global immediate optimization

On 2017-06-30 at 15:36 +0300, Ivan Shapovalov wrote:
> On 2017-06-30 at 01:46 +, Wang, Rander wrote:
> > Hi,
> > 
> > The abs of UD has to be applied if it is encoded in the instruction,
> > whether or not it makes sense.
> > I have discussed this with my colleague and refined it.
> > First we inspected the HW behavior of ABS(UD) and -(UD) and found that
> > ABS(UD) = UD,
> > -(UD) = the result of -(UD) on the CPU.
> > 
> > So the abs calculation can be removed, and this will make it compile.
> > 
> > Rander
> 
> Hi,
> 
> OK, but what about reading from .value.ud if the corresponding .type 
> is not GEN_TYPE_UD? Is this a concern? Which operand type combinations 
> are possible?
> 

I mean, due to an || in the conditional it looks like it is possible for either 
of the operands to not be a GEN_TYPE_D. Suppose the first operand is a signed 
dword (GEN_TYPE_D) that holds a negative value and has the ABS flag. In this 
case the new code will yield a significantly wrong result. Is this possible?

--
Ivan Shapovalov / intelfx /


Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-06-30 Thread Ivan Shapovalov
On 2017-06-30 at 15:36 +0300, Ivan Shapovalov wrote:
> On 2017-06-30 at 01:46 +, Wang, Rander wrote:
> > Hi,
> > 
> > The abs of UD has to be applied if it is encoded in the instruction,
> > whether or not it makes sense.
> > I have discussed this with my colleague and refined it.
> > First we inspected the HW behavior of ABS(UD) and -(UD) and found that
> > ABS(UD) = UD,
> > -(UD) = the result of -(UD) on the CPU.
> > 
> > So the abs calculation can be removed, and this will make it compile.
> > 
> > Rander 
> 
> Hi,
> 
> OK, but what about reading from .value.ud if the corresponding .type is
> not GEN_TYPE_UD? Is this a concern? Which operand type combinations are
> possible?
> 

I mean, due to an || in the conditional it looks like it is possible
for either of the operands to not be a GEN_TYPE_D. Suppose the first
operand is a signed dword (GEN_TYPE_D) that holds a negative value and
has the ABS flag. In this case the new code will yield a significantly
wrong result. Is this possible?
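
To make the concern concrete, here is a minimal sketch. Everything in it is a hypothetical stand-in (the `Imm` struct, the `ImmType` enum and both functions are illustrative and are not Beignet's GenRegister): it compares an evaluation that honours ABS on a signed dword with one that treats every operand as UD and skips ABS.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; not Beignet's real definitions.
enum ImmType { TYPE_D, TYPE_UD };

struct Imm {
  ImmType type;
  bool absolute;  // ABS source modifier
  union { int32_t d; uint32_t ud; } value;
};

// Applies ABS only when the tag says the value is a signed dword.
uint32_t evalTypeAware(const Imm &imm) {
  if (imm.type == TYPE_D && imm.absolute) {
    int64_t wide = imm.value.d;  // widen so INT32_MIN is handled safely
    return static_cast<uint32_t>(wide < 0 ? -wide : wide);
  }
  return imm.type == TYPE_D ? static_cast<uint32_t>(imm.value.d) : imm.value.ud;
}

// Ignores ABS and treats every operand as UD.
uint32_t evalAsUD(const Imm &imm) {
  return imm.type == TYPE_D ? static_cast<uint32_t>(imm.value.d) : imm.value.ud;
}
```

With a TYPE_D operand holding -5 and the ABS flag set, `evalTypeAware` returns 5 while `evalAsUD` returns 0xFFFFFFFB, which is the kind of mismatch the question is about.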

-- 
Ivan Shapovalov / intelfx /



Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-06-30 Thread Ivan Shapovalov
On 2017-06-30 at 01:46 +, Wang, Rander wrote:
> Hi,
> 
>   The abs of UD has to be applied if it is encoded in the instruction,
> whether or not it makes sense.
> I have discussed this with my colleague and refined it.
> First we inspected the HW behavior of ABS(UD) and -(UD) and found that
> ABS(UD) = UD,
> -(UD) = the result of -(UD) on the CPU.
> 
>   So the abs calculation can be removed, and this will make it compile.
> 
> Rander 

Hi,

OK, but what about reading from .value.ud if the corresponding .type is
not GEN_TYPE_UD? Is this a concern? Which operand type combinations are
possible?

-- 
Ivan Shapovalov / intelfx /

> 
> -Original Message-
> From: Ivan Shapovalov [mailto:inte...@intelfx.name] 
> Sent: Wednesday, June 28, 2017 8:54 AM
> To: Wang, Rander <rander.w...@intel.com>; beignet@lists.freedesktop.o
> rg
> Cc: Song, Ruiling <ruiling.s...@intel.com>
> Subject: Re: [Beignet] [PATCH V4] backend: add global immediate
> optimization
> 
> On 2017-06-14 at 13:55 +0800, rander.wanga wrote:
> > there are some global immediates in global var list of LLVM.
> > these imm can be integrated in instructions. for
> > compiler_global_immediate_optimized test
> > in utest, there are two global immediates:
> > L0:
> > MOV(1)  %42<0>:UD   :   0x0:UD
> > MOV(1)  %43<0>:UD   :   0x30:UD
> > 
> > used by:
> > ADD(16) %49<1>:D    :   %42<0,1,0>:D   %48<8,8,1>:D
> > ADD(16) %54<1>:D    :   %43<0,1,0>:D   %53<8,8,1>:D
> > 
> > it can be
> > ADD(16) %49<1>:D    :   %48<8,8,1>:D   0x0:UD
> > ADD(16) %54<1>:D    :   %53<8,8,1>:D   0x30:UD
> > 
> > Then the MOV can be removed. And after this optimization, ADD 0 can
> > be changed to MOV, then local copy propagation can be done.
> > 
> > V2: (1) add environment variable to enable/disable the optimization
> > (2) refine the architecture of imm optimization, inherit from global
> > optimizer not local block optimizer
> > 
> > V3: merge with latest master driver
> > 
> > V4: (1)refine some type errors
> > (2)remove UD/D check for no need
> > (3)refine imm calculate for UD/D
> > 
> > Signed-off-by: rander.wang 
> > ---
> >  .../src/backend/gen_insn_selection_optimize.cpp| 367
> > +++--
> >  1 file changed, 342 insertions(+), 25 deletions(-)
> > 
> > diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> > b/backend/src/backend/gen_insn_selection_optimize.cpp
> > index 07547ec..eb93a20 100644
> > --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> > +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> > @@ -40,6 +40,33 @@ namespace gbe
> >  return elements;
> >}
> >  
> > +  class ReplaceInfo
> > +  {
> > +  public:
> > +    ReplaceInfo(SelectionInstruction &insn,
> > +                const GenRegister &intermedia,
> > +                const GenRegister &replacement) : insn(insn),
> > +                intermedia(intermedia), replacement(replacement)
> > +    {
> > +      assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> > +      assert(&(insn.dst(0)) == &intermedia);
> > +      this->elements = CalculateElements(intermedia, insn.state.execWidth);
> > +      replacementOverwritten = false;
> > +    }
> > +    ~ReplaceInfo()
> > +    {
> > +      this->toBeReplaceds.clear();
> > +    }
> > +
> > +    SelectionInstruction &insn;
> > +    const GenRegister &intermedia;
> > +    uint32_t elements;
> > +    const GenRegister &replacement;
> > +    set<GenRegister*> toBeReplaceds;
> > +    set<SelectionInstruction*> toBeReplacedInsns;
> > +    bool replacementOverwritten;
> > +    GBE_CLASS(ReplaceInfo);
> > +  };
> > +
> >class SelOptimizer
> >{
> >public:
> > @@ -66,32 +93,7 @@ namespace gbe
> >  
> >private:
> >  // local copy propagation
> > -class ReplaceInfo
> > -{
> > -public:
> > -  ReplaceInfo(SelectionInstruction& insn,
> > -  const GenRegister& intermedia,
> > -  const GenRegister

Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-06-29 Thread Wang, Rander
Hi,

The abs of UD has to be applied if it is encoded in the instruction, whether
or not it makes sense.
I have discussed this with my colleague and refined it.
First we inspected the HW behavior of ABS(UD) and -(UD) and found that
ABS(UD) = UD,
-(UD) = the result of -(UD) on the CPU.

So the abs calculation can be removed, and this will make it compile.

Rander 
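
The two observations above can be written down as a short C++ sketch of the CPU-side calculation. The function names `absUD` and `negUD` are hypothetical and only illustrate the reported behavior; they are not taken from the patch.

```cpp
#include <cstdint>

// ABS on an unsigned dword is the identity, as reported above.
uint32_t absUD(uint32_t v) { return v; }

// Negating an unsigned dword on the CPU wraps modulo 2^32.
uint32_t negUD(uint32_t v) { return 0u - v; }
```

For instance, `negUD(0x30)` is `0xFFFFFFD0`. This says nothing about operands whose type is D, which need their own handling.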

-Original Message-
From: Ivan Shapovalov [mailto:inte...@intelfx.name] 
Sent: Wednesday, June 28, 2017 8:54 AM
To: Wang, Rander <rander.w...@intel.com>; beignet@lists.freedesktop.org
Cc: Song, Ruiling <ruiling.s...@intel.com>
Subject: Re: [Beignet] [PATCH V4] backend: add global immediate optimization

On 2017-06-14 at 13:55 +0800, rander.wanga wrote:
> there are some global immediates in global var list of LLVM.
> these imm can be integrated in instructions. for 
> compiler_global_immediate_optimized test
> in utest, there are two global immediates:
> L0:
> MOV(1)  %42<0>:UD   :   0x0:UD
> MOV(1)  %43<0>:UD   :   0x30:UD
> 
> used by:
> ADD(16) %49<1>:D    :   %42<0,1,0>:D   %48<8,8,1>:D
> ADD(16) %54<1>:D    :   %43<0,1,0>:D   %53<8,8,1>:D
> 
> it can be
> ADD(16) %49<1>:D    :   %48<8,8,1>:D   0x0:UD
> ADD(16) %54<1>:D    :   %53<8,8,1>:D   0x30:UD
> 
>   Then the MOV can be removed. And after this optimization, ADD 0 can
>   be changed to MOV, then local copy propagation can be done.
> 
>   V2: (1) add environment variable to enable/disable the optimization
>   (2) refine the architecture of imm optimization, inherit from 
> global
> optimizer not local block optimizer
> 
>   V3: merge with latest master driver
> 
>   V4: (1)refine some type errors
>   (2)remove UD/D check for no need
>   (3)refine imm calculate for UD/D
> 
> Signed-off-by: rander.wang 
> ---
>  .../src/backend/gen_insn_selection_optimize.cpp| 367
> +++--
>  1 file changed, 342 insertions(+), 25 deletions(-)
> 
> diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> b/backend/src/backend/gen_insn_selection_optimize.cpp
> index 07547ec..eb93a20 100644
> --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> @@ -40,6 +40,33 @@ namespace gbe
>  return elements;
>}
>  
> +  class ReplaceInfo
> +  {
> +  public:
> +    ReplaceInfo(SelectionInstruction &insn,
> +                const GenRegister &intermedia,
> +                const GenRegister &replacement) : insn(insn),
> +                intermedia(intermedia), replacement(replacement)
> +    {
> +      assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> +      assert(&(insn.dst(0)) == &intermedia);
> +      this->elements = CalculateElements(intermedia, insn.state.execWidth);
> +      replacementOverwritten = false;
> +    }
> +    ~ReplaceInfo()
> +    {
> +      this->toBeReplaceds.clear();
> +    }
> +
> +    SelectionInstruction &insn;
> +    const GenRegister &intermedia;
> +    uint32_t elements;
> +    const GenRegister &replacement;
> +    set<GenRegister*> toBeReplaceds;
> +    set<SelectionInstruction*> toBeReplacedInsns;
> +    bool replacementOverwritten;
> +    GBE_CLASS(ReplaceInfo);
> +  };
> +
>class SelOptimizer
>{
>public:
> @@ -66,32 +93,7 @@ namespace gbe
>  
>private:
>  // local copy propagation
> -class ReplaceInfo
> -{
> -public:
> -  ReplaceInfo(SelectionInstruction& insn,
> -  const GenRegister& intermedia,
> -  const GenRegister& replacement) :
> -  insn(insn), intermedia(intermedia),
> replacement(replacement)
> -  {
> -assert(insn.opcode == SEL_OP_MOV || insn.opcode ==
> SEL_OP_ADD);
> -assert(&(insn.dst(0)) == &intermedia);
> -this->elements = CalculateElements(intermedia,
> insn.state.execWidth);
> -replacementOverwritten = false;
> -  }
> -  ~ReplaceInfo()
> -  {
> -this->toBeReplaceds.clear();
> -  }
>  
> -  SelectionInstruction& insn;
> -  const GenRegister& intermedia;
> -  uint32_t elements;
> -  const GenRegister& replacement;
> -  set<GenRegister*> toBeReplaceds;
> -  bool replacementOverwritten;
> -  GBE_CLASS(ReplaceInfo);
> -};
>  typedef map<ir::Register, ReplaceIn

Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-06-27 Thread Ivan Shapovalov
On 2017-06-14 at 13:55 +0800, rander.wanga wrote:
> there are some global immediates in global var list of LLVM.
> these imm can be integrated in instructions. for
> compiler_global_immediate_optimized test
> in utest, there are two global immediates:
> L0:
> MOV(1)  %42<0>:UD   :   0x0:UD
> MOV(1)  %43<0>:UD   :   0x30:UD
> 
> used by:
> ADD(16) %49<1>:D    :   %42<0,1,0>:D   %48<8,8,1>:D
> ADD(16) %54<1>:D    :   %43<0,1,0>:D   %53<8,8,1>:D
> 
> it can be
> ADD(16) %49<1>:D    :   %48<8,8,1>:D   0x0:UD
> ADD(16) %54<1>:D    :   %53<8,8,1>:D   0x30:UD
> 
>   Then the MOV can be removed. And after this optimization, ADD 0 can
>   be changed to MOV, then local copy propagation can be done.
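
The transformation described in the commit message can be sketched on a toy IR. This is only an illustration under stated assumptions: `Op`, `Insn` and `foldImmediates` are hypothetical names, not Beignet's classes, and the sketch assumes a register qualifies as a global immediate when it is written exactly once, by a MOV of an immediate.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Toy IR for illustration only; these are not Beignet's classes.
enum Op { MOV_IMM, ADD_REG, ADD_IMM };

struct Insn {
  Op op;
  int dst;       // destination virtual register
  int src0;      // first source register (-1 for MOV_IMM)
  int src1;      // second source register (ADD_REG only, else -1)
  uint32_t imm;  // immediate (MOV_IMM and ADD_IMM)
};

// Folds registers written exactly once by MOV_IMM into their ADD users and
// drops each MOV whose value is no longer read from a register.
std::vector<Insn> foldImmediates(const std::vector<Insn> &in) {
  std::map<int, int> defCount;    // how many times each register is written
  std::map<int, uint32_t> immOf;  // value of registers written by MOV_IMM
  for (const Insn &i : in) {
    ++defCount[i.dst];
    if (i.op == MOV_IMM) immOf[i.dst] = i.imm;
  }
  auto isConst = [&](int r) { return immOf.count(r) != 0 && defCount[r] == 1; };

  std::set<int> stillRead;  // constant registers that could not be folded
  std::vector<Insn> rewritten;
  for (const Insn &i : in) {
    Insn n = i;
    if (i.op == ADD_REG) {
      if (isConst(i.src0) && !isConst(i.src1)) {
        n = Insn{ADD_IMM, i.dst, i.src1, -1, immOf[i.src0]};
      } else if (isConst(i.src1)) {
        n = Insn{ADD_IMM, i.dst, i.src0, -1, immOf[i.src1]};
        if (isConst(i.src0)) stillRead.insert(i.src0);  // one immediate per ADD
      }
    } else if (i.op == ADD_IMM && isConst(i.src0)) {
      stillRead.insert(i.src0);
    }
    rewritten.push_back(n);
  }

  std::vector<Insn> out;
  for (const Insn &i : rewritten) {
    bool deadMov = i.op == MOV_IMM && isConst(i.dst) && stillRead.count(i.dst) == 0;
    if (!deadMov) out.push_back(i);
  }
  return out;
}
```

Fed the four instructions from the example (two MOVs into %42 and %43, two ADDs reading them), the sketch returns just the two ADDs with immediates 0x0 and 0x30. Turning an ADD of 0 into a MOV, as the commit message mentions, is left out here.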
> 
>   V2: (1) add environment variable to enable/disable the
> optimization
>   (2) refine the architecture of imm optimization, inherit
> from global
> optimizer not local block optimizer
> 
>   V3: merge with latest master driver
> 
>   V4: (1)refine some type errors
>   (2)remove UD/D check for no need
>   (3)refine imm calculate for UD/D
> 
> Signed-off-by: rander.wang 
> ---
>  .../src/backend/gen_insn_selection_optimize.cpp| 367
> +++--
>  1 file changed, 342 insertions(+), 25 deletions(-)
> 
> diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> b/backend/src/backend/gen_insn_selection_optimize.cpp
> index 07547ec..eb93a20 100644
> --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> @@ -40,6 +40,33 @@ namespace gbe
>  return elements;
>}
>  
> +  class ReplaceInfo
> +  {
> +  public:
> +    ReplaceInfo(SelectionInstruction &insn,
> +                const GenRegister &intermedia,
> +                const GenRegister &replacement) : insn(insn),
> +                intermedia(intermedia), replacement(replacement)
> +    {
> +      assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> +      assert(&(insn.dst(0)) == &intermedia);
> +      this->elements = CalculateElements(intermedia, insn.state.execWidth);
> +      replacementOverwritten = false;
> +    }
> +    ~ReplaceInfo()
> +    {
> +      this->toBeReplaceds.clear();
> +    }
> +
> +    SelectionInstruction &insn;
> +    const GenRegister &intermedia;
> +    uint32_t elements;
> +    const GenRegister &replacement;
> +    set<GenRegister*> toBeReplaceds;
> +    set<SelectionInstruction*> toBeReplacedInsns;
> +    bool replacementOverwritten;
> +    GBE_CLASS(ReplaceInfo);
> +  };
> +
>class SelOptimizer
>{
>public:
> @@ -66,32 +93,7 @@ namespace gbe
>  
>private:
>  // local copy propagation
> -class ReplaceInfo
> -{
> -public:
> -  ReplaceInfo(SelectionInstruction& insn,
> -  const GenRegister& intermedia,
> -  const GenRegister& replacement) :
> -  insn(insn), intermedia(intermedia),
> replacement(replacement)
> -  {
> -assert(insn.opcode == SEL_OP_MOV || insn.opcode ==
> SEL_OP_ADD);
> -assert(&(insn.dst(0)) == &intermedia);
> -this->elements = CalculateElements(intermedia,
> insn.state.execWidth);
> -replacementOverwritten = false;
> -  }
> -  ~ReplaceInfo()
> -  {
> -this->toBeReplaceds.clear();
> -  }
>  
> -  SelectionInstruction& insn;
> -  const GenRegister& intermedia;
> -  uint32_t elements;
> -  const GenRegister& replacement;
> -  set<GenRegister*> toBeReplaceds;
> -  bool replacementOverwritten;
> -  GBE_CLASS(ReplaceInfo);
> -};
>  typedef map<ir::Register, ReplaceInfo*> ReplaceInfoMap;
>  ReplaceInfoMap replaceInfoMap;
>  void doLocalCopyPropagation();
> @@ -298,13 +300,328 @@ namespace gbe
>  virtual void run();
>};
>  
> +  class SelGlobalImmMovOpt : public SelGlobalOptimizer
> +  {
> +  public:
> +SelGlobalImmMovOpt(const GenContext& ctx, uint32_t features,
> intrusive_list *blockList) :
> +  SelGlobalOptimizer(ctx, features)
> +  {
> +mblockList = blockList;
> +  }
> +
> +virtual void run();
> +
> +void addToReplaceInfoMap(SelectionInstruction& insn);
> +void doGlobalCopyPropagation();
> +bool CanBeReplaced(const ReplaceInfo* info,
> SelectionInstruction& insn, const GenRegister& var);
> +void cleanReplaceInfoMap();
> +void doReplacement(ReplaceInfo* info);
> +
> +  private:
> +intrusive_list *mblockList;
> +
> +typedef map<ir::Register, ReplaceInfo*> ReplaceInfoMap;
> +ReplaceInfoMap replaceInfoMap;
> +
> +  };
> +
> +  extern void outputSelectionInst(SelectionInstruction &insn);
> +
> +  void SelGlobalImmMovOpt::cleanReplaceInfoMap()
> +  {
> +for (auto& pair : replaceInfoMap) {
> +  ReplaceInfo* info = pair.second;
> +  doReplacement(info);
> +  delete info;

Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-06-22 Thread Yang, Rong R
Pushed, thanks.

> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> Song, Ruiling
> Sent: Thursday, June 22, 2017 14:30
> To: Wang, Rander <rander.w...@intel.com>; beig...@freedesktop.org
> Cc: Wang, Rander <rander.w...@intel.com>
> Subject: Re: [Beignet] [PATCH V4] backend: add global immediate
> optimization
> 
> LGTM
> 
> Ruiling
> 
> > -Original Message-
> > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf
> > Of rander.wang
> > Sent: Wednesday, June 14, 2017 1:56 PM
> > To: beig...@freedesktop.org
> > Cc: Wang, Rander <rander.w...@intel.com>
> > Subject: [Beignet] [PATCH V4] backend: add global immediate
> > optimization
> >
> > there are some global immediates in global var list of LLVM.
> > these imm can be integrated in instructions. for
> > compiler_global_immediate_optimized test
> > in utest, there are two global immediates:
> > L0:
> > MOV(1)  %42<0>:UD   :   0x0:UD
> > MOV(1)  %43<0>:UD   :   0x30:UD
> >
> > used by:
> > ADD(16) %49<1>:D    :   %42<0,1,0>:D   %48<8,8,1>:D
> > ADD(16) %54<1>:D    :   %43<0,1,0>:D   %53<8,8,1>:D
> >
> > it can be
> > ADD(16) %49<1>:D    :   %48<8,8,1>:D   0x0:UD
> > ADD(16) %54<1>:D    :   %53<8,8,1>:D   0x30:UD
> >
> > Then the MOV can be removed. And after this optimization, ADD 0 can
> > be changed to MOV, then local copy propagation can be done.
> >
> > V2: (1) add environment variable to enable/disable the optimization
> > (2) refine the architecture of imm optimization, inherit from global
> > optimizer not local block optimizer
> >
> > V3: merge with latest master driver
> >
> > V4: (1)refine some type errors
> > (2)remove UD/D check for no need
> > (3)refine imm calculate for UD/D
> >
> > Signed-off-by: rander.wang <rander.w...@intel.com>
> > ---
> >  .../src/backend/gen_insn_selection_optimize.cpp| 367
> > +++--
> >  1 file changed, 342 insertions(+), 25 deletions(-)
> >
> > diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> > b/backend/src/backend/gen_insn_selection_optimize.cpp
> > index 07547ec..eb93a20 100644
> > --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> > +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> > @@ -40,6 +40,33 @@ namespace gbe
> >  return elements;
> >}
> >
> > +  class ReplaceInfo
> > +  {
> > +  public:
> > +    ReplaceInfo(SelectionInstruction &insn,
> > +                const GenRegister &intermedia,
> > +                const GenRegister &replacement) : insn(insn),
> > +                intermedia(intermedia), replacement(replacement)
> > +    {
> > +      assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> > +      assert(&(insn.dst(0)) == &intermedia);
> > +      this->elements = CalculateElements(intermedia, insn.state.execWidth);
> > +      replacementOverwritten = false;
> > +    }
> > +    ~ReplaceInfo()
> > +    {
> > +      this->toBeReplaceds.clear();
> > +    }
> > +
> > +    SelectionInstruction &insn;
> > +    const GenRegister &intermedia;
> > +    uint32_t elements;
> > +    const GenRegister &replacement;
> > +    set<GenRegister*> toBeReplaceds;
> > +    set<SelectionInstruction*> toBeReplacedInsns;
> > +    bool replacementOverwritten;
> > +    GBE_CLASS(ReplaceInfo);
> > +  };
> > +
> >class SelOptimizer
> >{
> >public:
> > @@ -66,32 +93,7 @@ namespace gbe
> >
> >private:
> >  // local copy propagation
> > -class ReplaceInfo
> > -{
> > -public:
> > -  ReplaceInfo(SelectionInstruction& insn,
> > -  const GenRegister& intermedia,
> > -  const GenRegister& replacement) :
> > -  insn(insn), intermedia(intermedia), 
> > replacement(replacement)
> > -  {
> > -assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> > -assert(&(insn.dst(0)) == &intermedia);
> > -this->elements = CalculateElements(intermedia,
> insn.state.execWidth);
> &g

Re: [Beignet] [PATCH V4] backend: add global immediate optimization

2017-06-22 Thread Song, Ruiling
LGTM

Ruiling

> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> rander.wang
> Sent: Wednesday, June 14, 2017 1:56 PM
> To: beig...@freedesktop.org
> Cc: Wang, Rander 
> Subject: [Beignet] [PATCH V4] backend: add global immediate optimization
> 
> there are some global immediates in global var list of LLVM.
> these imm can be integrated in instructions. for
> compiler_global_immediate_optimized test
> in utest, there are two global immediates:
> L0:
> MOV(1)  %42<0>:UD   :   0x0:UD
> MOV(1)  %43<0>:UD   :   0x30:UD
> 
> used by:
> ADD(16) %49<1>:D    :   %42<0,1,0>:D   %48<8,8,1>:D
> ADD(16) %54<1>:D    :   %43<0,1,0>:D   %53<8,8,1>:D
> 
> it can be
> ADD(16) %49<1>:D    :   %48<8,8,1>:D   0x0:UD
> ADD(16) %54<1>:D    :   %53<8,8,1>:D   0x30:UD
> 
>   Then the MOV can be removed. And after this optimization, ADD 0 can
>   be changed to MOV, then local copy propagation can be done.
> 
>   V2: (1) add environment variable to enable/disable the optimization
>   (2) refine the architecture of imm optimization, inherit from global
> optimizer not local block optimizer
> 
>   V3: merge with latest master driver
> 
>   V4: (1)refine some type errors
>   (2)remove UD/D check for no need
>   (3)refine imm calculate for UD/D
> 
> Signed-off-by: rander.wang 
> ---
>  .../src/backend/gen_insn_selection_optimize.cpp| 367
> +++--
>  1 file changed, 342 insertions(+), 25 deletions(-)
> 
> diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> b/backend/src/backend/gen_insn_selection_optimize.cpp
> index 07547ec..eb93a20 100644
> --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> @@ -40,6 +40,33 @@ namespace gbe
>  return elements;
>}
> 
> +  class ReplaceInfo
> +  {
> +  public:
> +    ReplaceInfo(SelectionInstruction &insn,
> +                const GenRegister &intermedia,
> +                const GenRegister &replacement) : insn(insn),
> +                intermedia(intermedia), replacement(replacement)
> +    {
> +      assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> +      assert(&(insn.dst(0)) == &intermedia);
> +      this->elements = CalculateElements(intermedia, insn.state.execWidth);
> +      replacementOverwritten = false;
> +    }
> +    ~ReplaceInfo()
> +    {
> +      this->toBeReplaceds.clear();
> +    }
> +
> +    SelectionInstruction &insn;
> +    const GenRegister &intermedia;
> +    uint32_t elements;
> +    const GenRegister &replacement;
> +    set<GenRegister*> toBeReplaceds;
> +    set<SelectionInstruction*> toBeReplacedInsns;
> +    bool replacementOverwritten;
> +    GBE_CLASS(ReplaceInfo);
> +  };
> +
>class SelOptimizer
>{
>public:
> @@ -66,32 +93,7 @@ namespace gbe
> 
>private:
>  // local copy propagation
> -class ReplaceInfo
> -{
> -public:
> -  ReplaceInfo(SelectionInstruction& insn,
> -  const GenRegister& intermedia,
> -  const GenRegister& replacement) :
> -  insn(insn), intermedia(intermedia), 
> replacement(replacement)
> -  {
> -assert(insn.opcode == SEL_OP_MOV || insn.opcode == SEL_OP_ADD);
> -assert(&(insn.dst(0)) == &intermedia);
> -this->elements = CalculateElements(intermedia, insn.state.execWidth);
> -replacementOverwritten = false;
> -  }
> -  ~ReplaceInfo()
> -  {
> -this->toBeReplaceds.clear();
> -  }
> 
> -  SelectionInstruction& insn;
> -  const GenRegister& intermedia;
> -  uint32_t elements;
> -  const GenRegister& replacement;
> -  set<GenRegister*> toBeReplaceds;
> -  bool replacementOverwritten;
> -  GBE_CLASS(ReplaceInfo);
> -};
>  typedef map<ir::Register, ReplaceInfo*> ReplaceInfoMap;
>  ReplaceInfoMap replaceInfoMap;
>  void doLocalCopyPropagation();
> @@ -298,13 +300,328 @@ namespace gbe
>  virtual void run();
>};
> 
> +  class SelGlobalImmMovOpt : public SelGlobalOptimizer
> +  {
> +  public:
> +SelGlobalImmMovOpt(const GenContext& ctx, uint32_t features,
> intrusive_list *blockList) :
> +  SelGlobalOptimizer(ctx, features)
> +  {
> +mblockList = blockList;
> +  }
> +
> +virtual void run();
> +
> +void addToReplaceInfoMap(SelectionInstruction& insn);
> +void doGlobalCopyPropagation();
> +bool CanBeReplaced(const ReplaceInfo* info, SelectionInstruction& insn,
> const GenRegister& var);
> +void cleanReplaceInfoMap();
> +void doReplacement(ReplaceInfo* info);
> +
> +  private:
> +intrusive_list *mblockList;
> +
> +typedef map<ir::Register, ReplaceInfo*> ReplaceInfoMap;
> +ReplaceInfoMap