Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On 11/29/2015 09:24 AM, Ajit Kumar Agarwal wrote:
> I agree with the above.
>
> To add to the above, we only need to calculate the set of objects
> (SSA_NAMEs) that are live at the birth or the header of the loop. We
> don't need to calculate what is live through the loop by considering the
> live-in and live-out sets of all the basic blocks of the loop. This is
> because the set of objects (SSA_NAMEs) that are live-in at the birth or
> header of the loop will be live-in at every node in the loop.
>
> If a variable v is live-in at the header of the loop, then v is live-in
> at every node in the loop. To prove this, consider a loop L with header h
> such that the variable v, defined at d, is live-in at h. Since v is live
> at h, d is not part of L. This follows from the dominance property, i.e.
> h is strictly dominated by d. Furthermore, there exists a path from h to
> a use of v which does not go through d. For every node p of the loop,
> since the loop is a strongly connected component of the CFG, there exists
> a path, consisting only of nodes of L, from p to h. Concatenating those
> two paths proves that v is live-in and live-out of p.
>
> On top of the live-in set at the birth or header of the loop as proven
> above, we can calculate the live-out of the exit blocks of the loop and
> the live-in at the destination of the exit edges of the loop. This
> accounts for the liveness outside of the loop.
>
> The above two cases form the basis of a better estimator for register
> pressure as far as LICM is concerned.
>
> If you agree with the above, I will add this to the patch so that the
> register_used estimate gives a better estimate of register pressure for
> LICM.

Yes, I think we're in agreement.

jeff
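For readers following the header-liveness argument quoted above, here is a minimal sketch (not part of the patch) that checks the claim on a toy CFG. Everything in it is an assumption made for illustration: `toy_block`, `compute_liveness` and `header_live_in_covers_loop` are hypothetical names, and names are tracked in a plain 32-bit mask rather than in GCC's DF_LR_IN/DF_LR_OUT bitmaps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy basic block (hypothetical, not a GCC type).  */
struct toy_block
{
  uint32_t use;  /* Names read before any write in the block.  */
  uint32_t def;  /* Names written in the block.  */
  int succ[2];   /* Successor block indices, -1 if absent.  */
};

/* Iterate the classic backward liveness equations to a fixed point:
   out[b] = union of in[s] over successors s,
   in[b]  = use[b] | (out[b] & ~def[b]).  */
static void
compute_liveness (const struct toy_block *bb, int n,
                  uint32_t *live_in, uint32_t *live_out)
{
  int changed = 1;

  assert (n > 0);
  memset (live_in, 0, (size_t) n * sizeof *live_in);
  memset (live_out, 0, (size_t) n * sizeof *live_out);

  while (changed)
    {
      changed = 0;
      for (int b = n - 1; b >= 0; b--)
        {
          uint32_t out = 0;
          for (int s = 0; s < 2; s++)
            if (bb[b].succ[s] >= 0)
              out |= live_in[bb[b].succ[s]];

          uint32_t in = bb[b].use | (out & ~bb[b].def);
          if (in != live_in[b] || out != live_out[b])
            changed = 1;
          live_in[b] = in;
          live_out[b] = out;
        }
    }
}

/* Return 1 if every name live-in at HEADER is also live-in at each
   block listed in LOOP_BLOCKS, 0 otherwise.  */
static int
header_live_in_covers_loop (const uint32_t *live_in, int header,
                            const int *loop_blocks, int n_loop_blocks)
{
  for (int i = 0; i < n_loop_blocks; i++)
    if ((live_in[header] & ~live_in[loop_blocks[i]]) != 0)
      return 0;
  return 1;
}
```

On a five-block example (block 0 defines v, blocks 1 to 3 form the loop with header 1 and latch 3, block 2 defines a temporary used in block 3, and block 4 uses v after the exit), the check holds for v. The temporary is live-in at block 3 but not at the header, so the header set alone does not cover names that are born inside the loop.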
RE: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
Hello Jeff:

-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com]
Sent: Tuesday, November 17, 2015 4:30 AM
To: Ajit Kumar Agarwal; GCC Patches
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

On 11/16/2015 10:36 AM, Ajit Kumar Agarwal wrote:
>> For Induction variable optimization on tree SSA representation, the
>> register used logic is based on the number of phi nodes at the loop
>> header to represent the liveness at the loop. Current Logic used
>> only the number of phi nodes at the loop header. Changes are made to
>> represent the phi operands also live at the loop. Thus number of phi
>> operands also gets incremented in the number of registers used.

>>But my question is why is the # of PHI operands useful here. You'd have a
>>stronger argument if it was the number of unique operands in each PHI.
>>While I don't doubt this patch improved things, I think it's just putting a
>>band-aid over the problem.

>>I think anything that just looks at PHIs or at register liveness at loop
>>boundaries is inherently underestimating the register pressure implications
>>of code motion from inside to outside a loop.

>>If an object is pulled out of the loop, then it's going to conflict with
>>nearly every object that births in the loop (because the object being moved
>>now has to live throughout the loop). There are exceptions, but I doubt they
>>matter in practice. The object is also going to conflict with anything else
>>that is live through the loop. At least that's how it seems to me at first
>>thought.

>>So build the set of objects (SSA_NAMEs) that either birth or are live through
>>the loop that have the same type class as the object we want to hoist out of
>>the loop (scalar, floating point, vector). Use that set of objects to
>>estimate register pressure.

I agree with the above.

To add to the above, we only need to calculate the set of objects (SSA_NAMEs) that are live at the birth or the header of the loop. We don't need to calculate what is live through the loop by considering the live-in and live-out sets of all the basic blocks of the loop. This is because the set of objects (SSA_NAMEs) that are live-in at the birth or header of the loop will be live-in at every node in the loop.

If a variable v is live-in at the header of the loop, then v is live-in at every node in the loop. To prove this, consider a loop L with header h such that the variable v, defined at d, is live-in at h. Since v is live at h, d is not part of L. This follows from the dominance property, i.e. h is strictly dominated by d. Furthermore, there exists a path from h to a use of v which does not go through d. For every node p of the loop, since the loop is a strongly connected component of the CFG, there exists a path, consisting only of nodes of L, from p to h. Concatenating those two paths proves that v is live-in and live-out of p.

On top of the live-in set at the birth or header of the loop as proven above, we can calculate the live-out of the exit blocks of the loop and the live-in at the destination of the exit edges of the loop. This accounts for the liveness outside of the loop.

The above two cases form the basis of a better estimator for register pressure as far as LICM is concerned.

If you agree with the above, I will add this to the patch so that the register_used estimate gives a better estimate of register pressure for LICM.

Thanks & Regards
Ajit

>>It won't be exact because some of those objects could end up coloring the
>>same. But it's probably still considerably more accurate than what we have
>>now.

>>I suspect that would be a better estimator for register pressure as far as
>>LICM is concerned.

jeff
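The estimator described in this message (header live-in, plus liveness on the exit edges) could be sketched as below. This is only an illustration under stated assumptions, not the patch: `toy_exit_edge`, `loop_live_set` and `count_live` are hypothetical names, and the live sets are plain 32-bit masks standing in for the bitmaps a real implementation would read from the data-flow framework.

```c
#include <stddef.h>
#include <stdint.h>

/* Liveness on one loop exit edge (hypothetical type): the live-out set
   of the edge's source block and the live-in set of its destination.  */
struct toy_exit_edge
{
  uint32_t src_live_out;
  uint32_t dest_live_in;
};

/* Union the header's live-in set with the liveness on every exit edge.  */
static uint32_t
loop_live_set (uint32_t header_live_in,
               const struct toy_exit_edge *exits, size_t n_exits)
{
  uint32_t live = header_live_in;

  for (size_t i = 0; i < n_exits; i++)
    live |= exits[i].src_live_out | exits[i].dest_live_in;
  return live;
}

/* Number of names in LIVE, i.e. the raw register estimate.  */
static unsigned
count_live (uint32_t live)
{
  unsigned count = 0;

  /* Clear the lowest set bit on each iteration.  */
  for (; live != 0; live &= live - 1)
    count++;
  return count;
}
```

Because the sets are unioned before counting, a name that is live both at the header and on an exit edge is counted once.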
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On Tue, Nov 17, 2015 at 1:56 AM, Ajit Kumar Agarwal wrote:
>
> Sorry I missed out some of the points in earlier mail which is given below.
>
> -----Original Message-----
> From: Ajit Kumar Agarwal
> Sent: Monday, November 16, 2015 11:07 PM
> To: 'Jeff Law'; GCC Patches
> Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: RE: [RFC, Patch]: Optimized changes in the register used inside loop
> for LICM and IVOPTS.
>
>
>
> -----Original Message-----
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Friday, November 13, 2015 11:44 AM
> To: Ajit Kumar Agarwal; GCC Patches
> Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop
> for LICM and IVOPTS.
>
> On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>
>>
>> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>>
>>
>> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
>> From: Ajit Kumar Agarwal
>> Date: Wed, 7 Oct 2015 20:50:40 +0200
>> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used inside
>> loop for LICM and IVOPTS.
>>
>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>> Induction variable optimization based on SSA representation. The
>> current logic used in LICM for register used inside the loops is
>> changed. The Live Out of the loop latch node and the Live in of the
>> destination of the exit nodes is used to set the Loops Liveness at the exit
>> of the Loop.
>> The register used is the number of live variables at the exit of the
>> Loop calculated above.
>>
>> For Induction variable optimization on tree SSA representation, the
>> register used logic is based on the number of phi nodes at the loop
>> header to represent the liveness at the loop. Current Logic used only
>> the number of phi nodes at the loop header. Changes are made to
>> represent the phi operands also live at the loop. Thus number of phi
>> operands also gets incremented in the number of registers used.
>>
>> ChangeLog:
>> 2015-10-09 Ajit Agarwal
>>
>> * loop-invariant.c (compute_loop_liveness): New.
>> (determine_regs_used): New.
>> (find_invariants_to_move): Use of determine_regs_used.
>> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
>> arguments for register used.
>
>>>I think Bin rejected the tree-ssa-loop-ivopts change. However, the
>>>loop-invariant change is still pending, right?
>
>>
>> Signed-off-by:Ajit agarwal ajit...@xilinx.com
>> ---
>> gcc/loop-invariant.c | 72 +-
>> gcc/tree-ssa-loop-ivopts.c | 4 +--
>> 2 files changed, 60 insertions(+), 16 deletions(-)
>>
>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
>> index 52c8ae8..e4291c9 100644
>> --- a/gcc/loop-invariant.c
>> +++ b/gcc/loop-invariant.c
>> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>>      }
>>  }
>>
>> +static int
>> +determine_regs_used()
>> +{
>> +  unsigned int j;
>> +  unsigned int reg_used = 2;
>> +  bitmap_iterator bi;
>> +
>> +  EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
>> +    (reg_used) ++;
>> +
>> +  return reg_used;
>> +}
>
>>>Isn't this just bitmap_count_bits (regs_live) + 2?
>
>> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>>      }
>>  }
>>
>> -
>> +static void
>> +calculate_loop_liveness (void)
>
>>>Needs a function comment.
>
> I will incorporate the above comments.
>
>> +{
>> +  basic_block bb;
>> +  struct loop *loop;
>>
>> -/* Move the invariants out of the loops. */
>> +  FOR_EACH_LOOP (loop, 0)
>> +    if (loop->aux == NULL)
>> +      {
>> +        loop->aux = xcalloc (1, sizeof (struct loop_data));
>> +        bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
>> +      }
>> +
>> +  FOR_EACH_BB_FN (bb, cfun)
>
>>>Why loop over blocks here? Why not just iterate through all the loops
>>>in the loop structure. Order isn't particularly important AFAICT for
>>>this code.
>
> Iterating over the loop structure is enough. We don't need to iterate over
> the basic blocks.
>
>> +    {
>> +      int i;
>> +      edge e;
>> +      vec<edge> edges;
>>
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On 11/16/2015 10:36 AM, Ajit Kumar Agarwal wrote:
> For Induction variable optimization on tree SSA representation, the
> register used logic is based on the number of phi nodes at the loop
> header to represent the liveness at the loop. Current Logic used only
> the number of phi nodes at the loop header. Changes are made to
> represent the phi operands also live at the loop. Thus number of phi
> operands also gets incremented in the number of registers used.

But my question is why is the # of PHI operands useful here. You'd have a stronger argument if it was the number of unique operands in each PHI. While I don't doubt this patch improved things, I think it's just putting a band-aid over the problem.

I think anything that just looks at PHIs or at register liveness at loop boundaries is inherently underestimating the register pressure implications of code motion from inside to outside a loop.

If an object is pulled out of the loop, then it's going to conflict with nearly every object that births in the loop (because the object being moved now has to live throughout the loop). There are exceptions, but I doubt they matter in practice. The object is also going to conflict with anything else that is live through the loop. At least that's how it seems to me at first thought.

So build the set of objects (SSA_NAMEs) that either birth or are live through the loop that have the same type class as the object we want to hoist out of the loop (scalar, floating point, vector). Use that set of objects to estimate register pressure.

It won't be exact because some of those objects could end up coloring the same. But it's probably still considerably more accurate than what we have now.

I suspect that would be a better estimator for register pressure as far as LICM is concerned.

jeff
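A rough sketch of the estimator suggested in this message, for illustration only: count the names that are born in the loop or live through it and that share a type class with the hoisting candidate. `toy_name`, `toy_type_class` and `estimate_pressure` are hypothetical names, and the per-name flags are assumed to have been computed elsewhere. As the message notes, the count is an upper bound, since some of those names may end up sharing a register.

```c
#include <stddef.h>

/* Type classes mentioned in the message (hypothetical enum).  */
enum toy_type_class
{
  CLASS_SCALAR,
  CLASS_FLOAT,
  CLASS_VECTOR
};

/* One SSA name as seen from a given loop (hypothetical type).  */
struct toy_name
{
  enum toy_type_class kind;
  int born_in_loop;       /* Defined inside the loop.  */
  int live_through_loop;  /* Defined outside, live across the loop.  */
};

/* Count the names of class KIND that would conflict with an object of
   the same class hoisted out of the loop.  */
static unsigned
estimate_pressure (const struct toy_name *names, size_t n,
                   enum toy_type_class kind)
{
  unsigned count = 0;

  for (size_t i = 0; i < n; i++)
    if (names[i].kind == kind
        && (names[i].born_in_loop || names[i].live_through_loop))
      count++;
  return count;
}
```

Keeping the classes separate matters because a hoisted floating-point value does not compete with scalar names for the same registers on most targets.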
RE: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
Sorry, I missed some of the points in my earlier mail; they are given below.

-----Original Message-----
From: Ajit Kumar Agarwal
Sent: Monday, November 16, 2015 11:07 PM
To: 'Jeff Law'; GCC Patches
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com]
Sent: Friday, November 13, 2015 11:44 AM
To: Ajit Kumar Agarwal; GCC Patches
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>
> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>
>
> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
> From: Ajit Kumar Agarwal
> Date: Wed, 7 Oct 2015 20:50:40 +0200
> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used inside
> loop for LICM and IVOPTS.
>
> Changes are done in the Loop Invariant(LICM) at RTL level and also the
> Induction variable optimization based on SSA representation. The
> current logic used in LICM for register used inside the loops is
> changed. The Live Out of the loop latch node and the Live in of the
> destination of the exit nodes is used to set the Loops Liveness at the exit
> of the Loop.
> The register used is the number of live variables at the exit of the
> Loop calculated above.
>
> For Induction variable optimization on tree SSA representation, the
> register used logic is based on the number of phi nodes at the loop
> header to represent the liveness at the loop. Current Logic used only
> the number of phi nodes at the loop header. Changes are made to
> represent the phi operands also live at the loop. Thus number of phi
> operands also gets incremented in the number of registers used.
>
> ChangeLog:
> 2015-10-09 Ajit Agarwal
>
> * loop-invariant.c (compute_loop_liveness): New.
> (determine_regs_used): New.
> (find_invariants_to_move): Use of determine_regs_used.
> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
> arguments for register used.

>>I think Bin rejected the tree-ssa-loop-ivopts change. However, the
>>loop-invariant change is still pending, right?

>
> Signed-off-by:Ajit agarwal ajit...@xilinx.com
> ---
> gcc/loop-invariant.c | 72 +-
> gcc/tree-ssa-loop-ivopts.c | 4 +--
> 2 files changed, 60 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index 52c8ae8..e4291c9 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>      }
>  }
>
> +static int
> +determine_regs_used()
> +{
> +  unsigned int j;
> +  unsigned int reg_used = 2;
> +  bitmap_iterator bi;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
> +    (reg_used) ++;
> +
> +  return reg_used;
> +}

>>Isn't this just bitmap_count_bits (regs_live) + 2?

> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>      }
>  }
>
> -
> +static void
> +calculate_loop_liveness (void)

>>Needs a function comment.

I will incorporate the above comments.

> +{
> +  basic_block bb;
> +  struct loop *loop;
>
> -/* Move the invariants out of the loops. */
> +  FOR_EACH_LOOP (loop, 0)
> +    if (loop->aux == NULL)
> +      {
> +        loop->aux = xcalloc (1, sizeof (struct loop_data));
> +        bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
> +      }
> +
> +  FOR_EACH_BB_FN (bb, cfun)

>>Why loop over blocks here? Why not just iterate through all the loops
>>in the loop structure. Order isn't particularly important AFAICT for
>>this code.

Iterating over the loop structure is enough. We don't need to iterate over the basic blocks.

> +    {
> +      int i;
> +      edge e;
> +      vec<edge> edges;
> +      edges = get_loop_exit_edges (loop);
> +      FOR_EACH_VEC_ELT (edges, i, e)
> +        {
> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_OUT(e->src));
> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
> +                           DF_LR_IN(e->dest));

>>Space before the open-paren in the previous two lines: DF_LR_OUT
>>(e->src) and DF_LR_IN (e->dest)

I will incorporate this.

> +        }
> +    }
> +  }
> +}
> +
> +/* Move the invariants ut of the loops. */

>>Looks lik
RE: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com]
Sent: Friday, November 13, 2015 11:44 AM
To: Ajit Kumar Agarwal; GCC Patches
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>
> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>
>
> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
> From: Ajit Kumar Agarwal
> Date: Wed, 7 Oct 2015 20:50:40 +0200
> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used inside
> loop for LICM and IVOPTS.
>
> Changes are done in the Loop Invariant(LICM) at RTL level and also the
> Induction variable optimization based on SSA representation. The
> current logic used in LICM for register used inside the loops is
> changed. The Live Out of the loop latch node and the Live in of the
> destination of the exit nodes is used to set the Loops Liveness at the exit
> of the Loop.
> The register used is the number of live variables at the exit of the
> Loop calculated above.
>
> For Induction variable optimization on tree SSA representation, the
> register used logic is based on the number of phi nodes at the loop
> header to represent the liveness at the loop. Current Logic used only
> the number of phi nodes at the loop header. Changes are made to
> represent the phi operands also live at the loop. Thus number of phi
> operands also gets incremented in the number of registers used.
>
> ChangeLog:
> 2015-10-09 Ajit Agarwal
>
> * loop-invariant.c (compute_loop_liveness): New.
> (determine_regs_used): New.
> (find_invariants_to_move): Use of determine_regs_used.
> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
> arguments for register used.

>>I think Bin rejected the tree-ssa-loop-ivopts change. However, the
>>loop-invariant change is still pending, right?

>
> Signed-off-by:Ajit agarwal ajit...@xilinx.com
> ---
> gcc/loop-invariant.c | 72 +-
> gcc/tree-ssa-loop-ivopts.c | 4 +--
> 2 files changed, 60 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index 52c8ae8..e4291c9 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>      }
>  }
>
> +static int
> +determine_regs_used()
> +{
> +  unsigned int j;
> +  unsigned int reg_used = 2;
> +  bitmap_iterator bi;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
> +    (reg_used) ++;
> +
> +  return reg_used;
> +}

>>Isn't this just bitmap_count_bits (regs_live) + 2?

> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>      }
>  }
>
> -
> +static void
> +calculate_loop_liveness (void)

>>Needs a function comment.

I will incorporate the above comments.

> +{
> +  basic_block bb;
> +  struct loop *loop;
>
> -/* Move the invariants out of the loops. */
> +  FOR_EACH_LOOP (loop, 0)
> +    if (loop->aux == NULL)
> +      {
> +        loop->aux = xcalloc (1, sizeof (struct loop_data));
> +        bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
> +      }
> +
> +  FOR_EACH_BB_FN (bb, cfun)

>>Why loop over blocks here? Why not just iterate through all the loops
>>in the loop structure. Order isn't particularly important AFAICT for
>>this code.

Iterating over the loop structure is enough. We don't need to iterate over the basic blocks.

> +    {
> +      int i;
> +      edge e;
> +      vec<edge> edges;
> +      edges = get_loop_exit_edges (loop);
> +      FOR_EACH_VEC_ELT (edges, i, e)
> +        {
> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_OUT(e->src));
> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_IN(e->dest));

>>Space before the open-paren in the previous two lines:
>>DF_LR_OUT (e->src) and DF_LR_IN (e->dest)

I will incorporate this.

> +        }
> +    }
> +  }
> +}
> +
> +/* Move the invariants ut of the loops. */

>>Looks like you introduced a typo.

>>I'd like to see testcases which show the change in # regs used
>>computation helping generate better code.

We need to measure a test case for the scenario where the new variable created for a loop invariant increases the register pressure, and the cost with respect to reg_used and new_regs increases, which leads to spill and fetch and drop the i
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On Fri, Nov 13, 2015 at 7:31 AM, Bin.Cheng wrote:
> On Fri, Nov 13, 2015 at 2:13 PM, Jeff Law wrote:
>> On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>>
>>>
>>> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>>>
>>>
>>> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
>>> From: Ajit Kumar Agarwal
>>> Date: Wed, 7 Oct 2015 20:50:40 +0200
>>> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used
>>> inside loop for LICM and IVOPTS.
>>>
>>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>>> Induction variable optimization based on SSA representation. The current
>>> logic used in LICM for register used inside the loops is changed. The
>>> Live Out of the loop latch node and the Live in of the destination of
>>> the exit nodes is used to set the Loops Liveness at the exit of the Loop.
>>> The register used is the number of live variables at the exit of the
>>> Loop calculated above.
>>>
>>> For Induction variable optimization on tree SSA representation, the
>>> register used logic is based on the number of phi nodes at the loop
>>> header to represent the liveness at the loop. Current Logic used only
>>> the number of phi nodes at the loop header. Changes are made to
>>> represent the phi operands also live at the loop. Thus number of phi
>>> operands also gets incremented in the number of registers used.
>>>
>>> ChangeLog:
>>> 2015-10-09 Ajit Agarwal
>>>
>>> * loop-invariant.c (compute_loop_liveness): New.
>>> (determine_regs_used): New.
>>> (find_invariants_to_move): Use of determine_regs_used.
>>> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
>>> arguments for register used.
>>
>> I think Bin rejected the tree-ssa-loop-ivopts change. However, the
>> loop-invariant change is still pending, right?
> Ah, reject is a strong word, I am just being dumb and don't understand
> why it's a general better estimation yet.
> Maybe Richard has some input here?

Not really. I agree with Bin that the change doesn't look like an improvement by design (might be one by accident for some benchmarks).

Richard.

> Thanks,
> bin
>>
>>
>>>
>>> Signed-off-by:Ajit agarwal ajit...@xilinx.com
>>> ---
>>> gcc/loop-invariant.c | 72 +-
>>> gcc/tree-ssa-loop-ivopts.c | 4 +--
>>> 2 files changed, 60 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
>>> index 52c8ae8..e4291c9 100644
>>> --- a/gcc/loop-invariant.c
>>> +++ b/gcc/loop-invariant.c
>>> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>>>      }
>>>  }
>>>
>>> +static int
>>> +determine_regs_used()
>>> +{
>>> +  unsigned int j;
>>> +  unsigned int reg_used = 2;
>>> +  bitmap_iterator bi;
>>> +
>>> +  EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
>>> +    (reg_used) ++;
>>> +
>>> +  return reg_used;
>>> +}
>>
>> Isn't this just bitmap_count_bits (regs_live) + 2?
>>
>>
>>> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>>>      }
>>>  }
>>>
>>> -
>>> +static void
>>> +calculate_loop_liveness (void)
>>
>> Needs a function comment.
>>
>>
>>> +{
>>> +  basic_block bb;
>>> +  struct loop *loop;
>>>
>>> -/* Move the invariants out of the loops. */
>>> +  FOR_EACH_LOOP (loop, 0)
>>> +    if (loop->aux == NULL)
>>> +      {
>>> +        loop->aux = xcalloc (1, sizeof (struct loop_data));
>>> +        bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
>>> +      }
>>> +
>>> +  FOR_EACH_BB_FN (bb, cfun)
>>
>> Why loop over blocks here? Why not just iterate through all the loops in
>> the loop structure. Order isn't particularly important AFAICT for this
>> code.
>>
>>
>>
>>> +    {
>>> +      int i;
>>> +      edge e;
>>> +      vec<edge> edges;
>>> +      edges = get_loop_exit_edges (loop);
>>> +      FOR_EACH_VEC_ELT (edges, i, e)
>>> +        {
>>> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>>> DF_LR_OUT(e->src));
>>> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>>> DF_LR_IN(e->dest));
>>
>> Space before the open-paren in the previous two lines:
>> DF_LR_OUT (e->src) and DF_LR_IN (e->dest)
>>
>>
>>> +        }
>>> +    }
>>> +  }
>>> +}
>>> +
>>> +/* Move the invariants ut of the loops. */
>>
>> Looks like you introduced a typo.
>>
>> I'd like to see testcases which show the change in # regs used computation
>> helping generate better code.
>>
>> And I'd also like to see some background information on why you think this
>> is a more accurate measure for the number of registers used in the loop.
>> regs_used AFAICT is supposed to be an estimate of the registers live around
>> the loop. So ISTM that you get that value by live-out set on the backedge
>> of the loop. I guess you get something similar by looking at the exit
>> edge's source block's live-out set. But I don't see any value in in
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On Fri, Nov 13, 2015 at 2:13 PM, Jeff Law wrote:
> On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>
>>
>> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>>
>>
>> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
>> From: Ajit Kumar Agarwal
>> Date: Wed, 7 Oct 2015 20:50:40 +0200
>> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used
>> inside loop for LICM and IVOPTS.
>>
>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>> Induction variable optimization based on SSA representation. The current
>> logic used in LICM for register used inside the loops is changed. The
>> Live Out of the loop latch node and the Live in of the destination of
>> the exit nodes is used to set the Loops Liveness at the exit of the Loop.
>> The register used is the number of live variables at the exit of the
>> Loop calculated above.
>>
>> For Induction variable optimization on tree SSA representation, the
>> register used logic is based on the number of phi nodes at the loop
>> header to represent the liveness at the loop. Current Logic used only
>> the number of phi nodes at the loop header. Changes are made to
>> represent the phi operands also live at the loop. Thus number of phi
>> operands also gets incremented in the number of registers used.
>>
>> ChangeLog:
>> 2015-10-09 Ajit Agarwal
>>
>> * loop-invariant.c (compute_loop_liveness): New.
>> (determine_regs_used): New.
>> (find_invariants_to_move): Use of determine_regs_used.
>> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
>> arguments for register used.
>
> I think Bin rejected the tree-ssa-loop-ivopts change. However, the
> loop-invariant change is still pending, right?

Ah, reject is a strong word, I am just being dumb and don't understand why it's a general better estimation yet.
Maybe Richard has some input here?

Thanks,
bin

>
>
>>
>> Signed-off-by:Ajit agarwal ajit...@xilinx.com
>> ---
>> gcc/loop-invariant.c | 72 +-
>> gcc/tree-ssa-loop-ivopts.c | 4 +--
>> 2 files changed, 60 insertions(+), 16 deletions(-)
>>
>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
>> index 52c8ae8..e4291c9 100644
>> --- a/gcc/loop-invariant.c
>> +++ b/gcc/loop-invariant.c
>> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>>      }
>>  }
>>
>> +static int
>> +determine_regs_used()
>> +{
>> +  unsigned int j;
>> +  unsigned int reg_used = 2;
>> +  bitmap_iterator bi;
>> +
>> +  EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
>> +    (reg_used) ++;
>> +
>> +  return reg_used;
>> +}
>
> Isn't this just bitmap_count_bits (regs_live) + 2?
>
>
>> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>>      }
>>  }
>>
>> -
>> +static void
>> +calculate_loop_liveness (void)
>
> Needs a function comment.
>
>
>> +{
>> +  basic_block bb;
>> +  struct loop *loop;
>>
>> -/* Move the invariants out of the loops. */
>> +  FOR_EACH_LOOP (loop, 0)
>> +    if (loop->aux == NULL)
>> +      {
>> +        loop->aux = xcalloc (1, sizeof (struct loop_data));
>> +        bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
>> +      }
>> +
>> +  FOR_EACH_BB_FN (bb, cfun)
>
> Why loop over blocks here? Why not just iterate through all the loops in
> the loop structure. Order isn't particularly important AFAICT for this
> code.
>
>
>
>> +    {
>> +      int i;
>> +      edge e;
>> +      vec<edge> edges;
>> +      edges = get_loop_exit_edges (loop);
>> +      FOR_EACH_VEC_ELT (edges, i, e)
>> +        {
>> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>> DF_LR_OUT(e->src));
>> +          bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>> DF_LR_IN(e->dest));
>
> Space before the open-paren in the previous two lines:
> DF_LR_OUT (e->src) and DF_LR_IN (e->dest)
>
>
>> +        }
>> +    }
>> +  }
>> +}
>> +
>> +/* Move the invariants ut of the loops. */
>
> Looks like you introduced a typo.
>
> I'd like to see testcases which show the change in # regs used computation
> helping generate better code.
>
> And I'd also like to see some background information on why you think this
> is a more accurate measure for the number of registers used in the loop.
> regs_used AFAICT is supposed to be an estimate of the registers live around
> the loop. So ISTM that you get that value by live-out set on the backedge
> of the loop. I guess you get something similar by looking at the exit
> edge's source block's live-out set. But I don't see any value in including
> stuff live at the block outside the loop.
>
> It also seems fairly non-intuitive. Get the loop's latch and use its
> live-out set. That seems more intuitive.
>
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote: 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001 From: Ajit Kumar Agarwal Date: Wed, 7 Oct 2015 20:50:40 +0200 Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS. Changes are done in the Loop Invariant(LICM) at RTL level and also the Induction variable optimization based on SSA representation. The current logic used in LICM for register used inside the loops is changed. The Live Out of the loop latch node and the Live in of the destination of the exit nodes is used to set the Loops Liveness at the exit of the Loop. The register used is the number of live variables at the exit of the Loop calculated above. For Induction variable optimization on tree SSA representation, the register used logic is based on the number of phi nodes at the loop header to represent the liveness at the loop. Current Logic used only the number of phi nodes at the loop header. Changes are made to represent the phi operands also live at the loop. Thus number of phi operands also gets incremented in the number of registers used. ChangeLog: 2015-10-09 Ajit Agarwal * loop-invariant.c (compute_loop_liveness): New. (determine_regs_used): New. (find_invariants_to_move): Use of determine_regs_used. * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi arguments for register used. I think Bin rejected the tree-ssa-loop-ivopts change. However, the loop-invariant change is still pending, right? 
Signed-off-by:Ajit agarwalajit...@xilinx.com --- gcc/loop-invariant.c | 72 +- gcc/tree-ssa-loop-ivopts.c | 4 +-- 2 files changed, 60 insertions(+), 16 deletions(-) diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index 52c8ae8..e4291c9 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain) } } +static int +determine_regs_used() +{ + unsigned int j; + unsigned int reg_used = 2; + bitmap_iterator bi; + + EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi) +(reg_used) ++; + + return reg_used; +} Isn't this just bitmap_count_bits (regs_live) + 2? @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void) } } - +static void +calculate_loop_liveness (void) Needs a function comment. +{ + basic_block bb; + struct loop *loop; -/* Move the invariants out of the loops. */ + FOR_EACH_LOOP (loop, 0) +if (loop->aux == NULL) + { +loop->aux = xcalloc (1, sizeof (struct loop_data)); +bitmap_initialize (&LOOP_DATA (loop)->regs_live, ®_obstack); + } + + FOR_EACH_BB_FN (bb, cfun) Why loop over blocks here? Why not just iterate through all the loops in the loop structure. Order isn't particularly important AFAICT for this code. + { + int i; + edge e; + vec edges; + edges = get_loop_exit_edges (loop); + FOR_EACH_VEC_ELT (edges, i, e) + { + bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_OUT(e->src)); + bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_IN(e->dest)); Space before the open-paren in the previous two lines DF_LR_OUT (e->src) and FD_LR_INT (e->dest)) + } + } + } +} + +/* Move the invariants ut of the loops. */ Looks like you introduced a typo. I'd like to see testcases which show the change in # regs used computation helping generate better code. And I'd also like to see some background information on why you think this is a more accurate measure for the number of registers used in the loop. 
regs_used AFAICT is supposed to be an estimate of the registers live around the loop. So ISTM that you get that value from the live-out set on the backedge of the loop. I guess you get something similar by looking at the live-out set of the exit edge's source block. But I don't see any value in including stuff live at the block outside the loop. It also seems fairly non-intuitive.

Get the loop's latch and use its live-out set. That seems more intuitive.
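The two candidate definitions of "registers live around the loop" discussed above can be put side by side on toy data. This is only a sketch under stated assumptions: `toy_bitmap`, `toy_exit_edge`, `toy_loop` and the `toy_*` functions are hypothetical, the fields merely stand in for DF_LR_OUT and DF_LR_IN of the relevant blocks, and no GCC data structure is used:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy register set: bit i set means register i is live.  */
typedef uint64_t toy_bitmap;

/* One loop exit edge, with hypothetical precomputed liveness.  */
struct toy_exit_edge
{
  toy_bitmap src_live_out;    /* Stand-in for DF_LR_OUT (e->src).  */
  toy_bitmap dest_live_in;    /* Stand-in for DF_LR_IN (e->dest).  */
};

struct toy_loop
{
  toy_bitmap latch_live_out;  /* Stand-in for DF_LR_OUT of the latch.  */
  const struct toy_exit_edge *exits;
  size_t n_exits;
};

/* Number of registers in a set.  */
unsigned
toy_popcount (toy_bitmap b)
{
  unsigned n = 0;
  for (; b; b &= b - 1)
    n++;
  return n;
}

/* Shape of the patch: union of the live-out of every exit source and
   the live-in of every exit destination.  */
toy_bitmap
toy_live_at_exits (const struct toy_loop *loop)
{
  toy_bitmap live = 0;
  for (size_t i = 0; i < loop->n_exits; i++)
    live |= loop->exits[i].src_live_out | loop->exits[i].dest_live_in;
  return live;
}

/* Shape of the review suggestion: what is live on the backedge.  */
toy_bitmap
toy_live_on_backedge (const struct toy_loop *loop)
{
  return loop->latch_live_out;
}
```

With made-up sets such as a latch live-out of {r0, r1, r2} and a single exit whose source live-out is {r0, r2, r3}, the two estimates disagree about r1 and r3 even though both count three registers. That disagreement is the kind of difference a testcase would need to expose.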
RE: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
-----Original Message-----
From: Bin.Cheng [mailto:amker.ch...@gmail.com]
Sent: Friday, October 09, 2015 8:15 AM
To: Ajit Kumar Agarwal
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

On Thu, Oct 8, 2015 at 1:53 PM, Ajit Kumar Agarwal wrote:
>
> -----Original Message-----
> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
> Sent: Thursday, October 08, 2015 10:29 AM
> To: Ajit Kumar Agarwal
> Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
> Hunsigida; Nagaraju Mekala
> Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop
> for LICM and IVOPTS.
>
> On Thu, Oct 8, 2015 at 12:32 PM, Ajit Kumar Agarwal wrote:
>> Following Proposed:
>>
>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>> Induction variable optimization based on SSA representation.
>> The current logic used in LICM for register used inside the loops is
>> changed. The Live Out of the loop latch node and the Live in of the
>> destination of the exit nodes is used to set the Loops Liveness at the exit
>> of the Loop. The register used is the number of live variables at the exit
>> of the Loop calculated above.
>>
>> For Induction variable optimization on tree SSA representation, the
>> register used logic is based on the number of phi nodes at the loop
>> header to represent the liveness at the loop. Current Logic used only the
>> number of phi nodes at the loop header. I have made changes to represent
>> the phi operands also live at the loop. Thus number of phi operands also
>> gets incremented in the number of registers used.
> Hi,
>>> For the GIMPLE IVO part, I don't think the change is reasonable enough.
>>> IMHO, IVO fails to restrict iv number in some complex cases, your change
>>> tries to rectify that by increasing register pressure irrespective to
>>> out-of-ssa and coalescing.
>>> I think the original code models reg-pressure better, what needs to be
>>> changed is how we compute cost from register pressure and use that to
>>> restrict iv number.
>
> Considering the liveness with respect to all the phi arguments will
> not increase the register pressure. It improves the heuristics for
> restricting The IV that increases the register pressure. The cost
> model uses regs_used and modelling the

>> I think register pressure is increased along with regs_needed, doesn't matter
>> if it will be canceled in estimate_reg_pressure_cost for both ends of cost
>> comparison.

> Liveness with respect to the phi arguments measures
> Better register pressure.

>> I agree IV number should be controlled for some cases, but not by increasing
>> `n' using phi argument number unconditionally. Considering summary
>> reduction as an example, most likely the ssa names will be coalesced and
>> held in single register. Furthermore, there is no reason to count phi
>> node/arg number for floating point phi nodes.

>
> Number of phi nodes in the loop header is not only the criteria for
> regs_used, but the number of liveness with respect to loop should be Criteria
> to measure appropriate register pressure.

>> IMHO, it's hard to accurately track liveness info on SSA(PHI), because of
>> coalescing etc. So could you give some examples/proof for this?

I agree with you that it is hard to predict the exact mapping from SSA names to the final register allocation, because of coalescing and out-of-SSA. Interference between phi arguments and phi results is an important criterion for register pressure on SSA. In conventional SSA the phi arguments do not interfere, but most current compilers do not keep SSA in conventional form. Non-conventional SSA arises when copy propagation of SSA names makes the phi arguments interfere. In that case the phi arguments interfere and should be counted in the registers used. I interpret the registers used as the number of interfering live ranges, which is what increases or decreases register pressure.

On top of the above, out-of-SSA (SSA name coalescing) is quite simple for conventional SSA: each phi node is assigned a new variable, the defs and uses are replaced with that new variable so that a single register can be assigned, and then the corresponding phi node is removed. But for non-conventional SSA, out-of-SSA makes the SSA conventional by inserting copies in each of the predecessor nodes and assigning them to new va
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On Thu, Oct 8, 2015 at 1:53 PM, Ajit Kumar Agarwal wrote:
>
> -----Original Message-----
> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
> Sent: Thursday, October 08, 2015 10:29 AM
> To: Ajit Kumar Agarwal
> Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida;
> Nagaraju Mekala
> Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop
> for LICM and IVOPTS.
>
> On Thu, Oct 8, 2015 at 12:32 PM, Ajit Kumar Agarwal wrote:
>> Following Proposed:
>>
>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>> Induction variable optimization based on SSA representation.
>> The current logic used in LICM for register used inside the loops is
>> changed. The Live Out of the loop latch node and the Live in of the
>> destination of the exit nodes is used to set the Loops Liveness at the exit
>> of the Loop. The register used is the number of live variables at the exit
>> of the Loop calculated above.
>>
>> For Induction variable optimization on tree SSA representation, the
>> register used logic is based on the number of phi nodes at the loop
>> header to represent the liveness at the loop. Current Logic used only the
>> number of phi nodes at the loop header. I have made changes to represent
>> the phi operands also live at the loop. Thus number of phi operands also
>> gets incremented in the number of registers used.
> Hi,
>>> For the GIMPLE IVO part, I don't think the change is reasonable enough.
>>> IMHO, IVO fails to restrict iv number in some complex cases, your change
>>> tries to rectify that by increasing register pressure irrespective to
>>> out-of-ssa and coalescing. I think the original code models reg-pressure
>>> better, what needs to be changed is how we compute cost from register
>>> pressure and use that to restrict iv number.
>
> Considering the liveness with respect to all the phi arguments will not
> increase the register pressure. It improves the heuristics for restricting
> The IV that increases the register pressure. The cost model uses regs_used
> and modelling the

I think register pressure is increased along with regs_needed, doesn't matter if it will be canceled in estimate_reg_pressure_cost for both ends of cost comparison.

> Liveness with respect to the phi arguments measures
> Better register pressure.

I agree IV number should be controlled for some cases, but not by increasing `n' using phi argument number unconditionally. Considering summary reduction as an example, most likely the ssa names will be coalesced and held in single register. Furthermore, there is no reason to count phi node/arg number for floating point phi nodes.

>
> Number of phi nodes in the loop header is not only the criteria for
> regs_used, but the number of liveness with respect to loop should be
> Criteria to measure appropriate register pressure.

IMHO, it's hard to accurately track liveness info on SSA(PHI), because of coalescing etc. So could you give some examples/proof for this?

Thanks,
bin

>
> Thanks & Regards
> Ajit
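Bin's reduction example can be made concrete with a plain accumulation loop. This is a sketch only: `toy_sum` is a hypothetical function, and the SSA names in the comment are illustrative rather than real GCC dump output.

```c
#include <stddef.h>

/* Sum reduction.  In SSA form the accumulator looks roughly like:

     sum_1 = 0                      (preheader)
     sum_2 = PHI <sum_1, sum_3>     (loop header)
     sum_3 = sum_2 + a[i]           (loop body)

   The names are made up for illustration.  sum_1 is dead once the PHI
   has consumed it, sum_2 is dead after the addition, and sum_3 is only
   needed until the PHI on the backedge, so none of the three live
   ranges overlap.  Out-of-SSA can therefore typically coalesce them
   into one register, whereas counting the PHI result plus both PHI
   arguments would charge the loop three registers.  */
long
toy_sum (const int *a, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```

Calling `toy_sum` on an array holding 1, 2, 3, 4 with `n` set to 4 adds the elements in order. The point is the single accumulator, not the result: one value is carried around the loop even though three SSA names describe it.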
RE: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
-----Original Message-----
From: Bin.Cheng [mailto:amker.ch...@gmail.com]
Sent: Thursday, October 08, 2015 10:29 AM
To: Ajit Kumar Agarwal
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

On Thu, Oct 8, 2015 at 12:32 PM, Ajit Kumar Agarwal wrote:
> Following Proposed:
>
> Changes are done in the Loop Invariant(LICM) at RTL level and also the
> Induction variable optimization based on SSA representation.
> The current logic used in LICM for register used inside the loops is
> changed. The Live Out of the loop latch node and the Live in of the
> destination of the exit nodes is used to set the Loops Liveness at the exit
> of the Loop. The register used is the number of live variables at the exit of
> the Loop calculated above.
>
> For Induction variable optimization on tree SSA representation, the
> register used logic is based on the number of phi nodes at the loop
> header to represent the liveness at the loop. Current Logic used only the
> number of phi nodes at the loop header. I have made changes to represent the
> phi operands also live at the loop. Thus number of phi operands also gets
> incremented in the number of registers used.

Hi,
>> For the GIMPLE IVO part, I don't think the change is reasonable enough.
>> IMHO, IVO fails to restrict iv number in some complex cases, your change
>> tries to rectify that by increasing register pressure irrespective to
>> out-of-ssa and coalescing. I think the original code models reg-pressure
>> better, what needs to be changed is how we compute cost from register
>> pressure and use that to restrict iv number.

Considering liveness with respect to all the phi arguments will not increase the register pressure. It improves the heuristic for restricting the IVs that increase register pressure. The cost model uses regs_used, and modelling liveness with respect to the phi arguments measures register pressure better. The number of phi nodes in the loop header is not the only criterion for regs_used; the number of values live with respect to the loop should be the criterion for measuring register pressure appropriately.

Thanks & Regards
Ajit

>> As for the specific function determine_set_costs, I think one change is
>> necessary to rule out all floating point phi nodes, because they do not have
>> impact on IVO register pressure. Actually this change will further reduce
>> register pressure for fp related cases.

Thanks,
bin

>
> Performance runs:
>
> Bootstrapping with i386 goes through fine. The spec cpu 2000
> benchmarks is run and following performance runs and the code size for
> i386 target seen.
>
> Ratio with the above optimization changes vs ratio without above
> optimizations for INT benchmarks (3785.261 vs 3783.064).
> Ratio with the above optimization changes vs ratio without above optimization
> for FP benchmarks ( 4676.763189 vs 4676.072428 ).
>
> Code size reduction for INT benchmarks : 2324 instructions.
> Code size reduction for FP benchmarks : 1283 instructions.
>
> For Microblaze target the Mibench and EEMBC benchmarks is run and the
> following improvements is seen.
>
> (qos_lite(5.3%), consumer_jpeg_c(1.34%), security_rijndael_d(1.8%),
> security_rijndael_e(1.4%))
>
> Code Size reduction for Mibench = 16164 instructions.
> Code Size reduction for EEMBC = 98 instructions.
>
> Patch ChangeLog:
>
> [PATCH] [RFC, Patch]: Optimized changes in the register used inside loop for
> LICM and IVOPTS.
>
> Changes are done in the Loop Invariant(LICM) at RTL level and also the
> Induction variable optimization based on SSA representation. The current
> logic used in LICM for register used inside the loops is changed.
> The Live Out of the loop latch node and the Live in of the destination
> of the exit nodes is used to set the Loops Liveness at the exit of
> the Loop. The register used is the number of live variables at the exit of
> the Loop calculated above.
>
> For Induction variable optimization on tree SSA representation, the
> register used logic is based on the number of phi nodes at the loop
> header to represent the liveness at the loop. Current Logic used only
> the number of phi nodes at the loop header. Changes are made to represent
> the phi operands also live at the loop. Thus number of phi operands also
> gets incremented in the number of registers used.
>
> ChangeLog:
> 2015-10-09 Ajit Agarwal
>
> * loop-invariant.c (compute_loop_liveness): New.
> (determine_regs_used): New.
> (find_invariants_to_move): Use of determine_regs_used.
> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
> arguments for register used.
>
> Signed-off-by:Ajit Agarwal ajit...@xilinx.com
>
> Thanks & Regards
> Ajit
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
On Thu, Oct 8, 2015 at 12:32 PM, Ajit Kumar Agarwal wrote:
> Following Proposed:
>
> Changes are done in the Loop Invariant(LICM) at RTL level and also the
> Induction variable optimization based on SSA representation.
> The current logic used in LICM for register used inside the loops is changed.
> The Live Out of the loop latch node and the Live in of the
> destination of the exit nodes is used to set the Loops Liveness at the exit
> of the Loop. The register used is the number of live variables
> at the exit of the Loop calculated above.
>
> For Induction variable optimization on tree SSA representation, the register
> used logic is based on the number of phi nodes at the loop
> header to represent the liveness at the loop. Current Logic used only the
> number of phi nodes at the loop header. I have made changes
> to represent the phi operands also live at the loop. Thus number of phi
> operands also gets incremented in the number of registers used.

Hi,
For the GIMPLE IVO part, I don't think the change is reasonable enough. IMHO, IVO fails to restrict iv number in some complex cases, your change tries to rectify that by increasing register pressure irrespective to out-of-ssa and coalescing. I think the original code models reg-pressure better, what needs to be changed is how we compute cost from register pressure and use that to restrict iv number.

As for the specific function determine_set_costs, I think one change is necessary to rule out all floating point phi nodes, because they do not have impact on IVO register pressure. Actually this change will further reduce register pressure for fp related cases.

Thanks,
bin

>
> Performance runs:
>
> Bootstrapping with i386 goes through fine. The spec cpu 2000 benchmarks is
> run and following performance runs and the code size for
> i386 target seen.
>
> Ratio with the above optimization changes vs ratio without above
> optimizations for INT benchmarks (3785.261 vs 3783.064).
> Ratio with the above optimization changes vs ratio without above optimization
> for FP benchmarks ( 4676.763189 vs 4676.072428 ).
>
> Code size reduction for INT benchmarks : 2324 instructions.
> Code size reduction for FP benchmarks : 1283 instructions.
>
> For Microblaze target the Mibench and EEMBC benchmarks is run and the
> following improvements is seen.
>
> (qos_lite(5.3%), consumer_jpeg_c(1.34%), security_rijndael_d(1.8%),
> security_rijndael_e(1.4%))
>
> Code Size reduction for Mibench = 16164 instructions.
> Code Size reduction for EEMBC = 98 instructions.
>
> Patch ChangeLog:
>
> [PATCH] [RFC, Patch]: Optimized changes in the register used inside loop for
> LICM and IVOPTS.
>
> Changes are done in the Loop Invariant(LICM) at RTL level and also the
> Induction variable optimization
> based on SSA representation. The current logic used in LICM for register used
> inside the loops is changed.
> The Live Out of the loop latch node and the Live in of the destination of the
> exit nodes is used to set the
> Loops Liveness at the exit of the Loop. The register used is the number of
> live variables at the exit of the
> Loop calculated above.
>
> For Induction variable optimization on tree SSA representation, the register
> used logic is based on the
> number of phi nodes at the loop header to represent the liveness at the
> loop. Current Logic used only
> the number of phi nodes at the loop header. Changes are made to represent
> the phi operands also live
> at the loop. Thus number of phi operands also gets incremented in the number
> of registers used.
>
> ChangeLog:
> 2015-10-09 Ajit Agarwal
>
> * loop-invariant.c (compute_loop_liveness): New.
> (determine_regs_used): New.
> (find_invariants_to_move): Use of determine_regs_used.
> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
> arguments for register used.
>
> Signed-off-by:Ajit Agarwal ajit...@xilinx.com
>
> Thanks & Regards
> Ajit
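The SPEC CPU 2000 ratios quoted above are easier to compare as relative changes. A small sketch of that arithmetic; `toy_relative_gain` is a hypothetical helper, and the only inputs assumed are the four ratios quoted in the thread:

```c
/* Relative change of a benchmark ratio, expressed as a fraction of the
   baseline (the ratio measured without the patch).  */
double
toy_relative_gain (double with_patch, double without_patch)
{
  return (with_patch - without_patch) / without_patch;
}

/* Ratios as quoted in the thread for the i386 target.  */
static const double toy_int_with = 3785.261;
static const double toy_int_without = 3783.064;
static const double toy_fp_with = 4676.763189;
static const double toy_fp_without = 4676.072428;

/* Gain for the INT benchmarks.  */
double
toy_int_gain (void)
{
  return toy_relative_gain (toy_int_with, toy_int_without);
}

/* Gain for the FP benchmarks.  */
double
toy_fp_gain (void)
{
  return toy_relative_gain (toy_fp_with, toy_fp_without);
}
```

Multiplying by 100 gives roughly 0.058% for INT and 0.015% for FP.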