can you get Pathscale's version of dse (or compiler) and see what is
the diff in behavior first?
Sun

On Mon, Jul 25, 2011 at 10:11 PM, Hui Shi <kalin....@gmail.com> wrote:
>
> Hi Fred,
>
> The new enhancement in DSE introduce problem that some chi node is not live
> due to same location with last store.
>
> For fixing the problem from copy propagation.
> " For example, you can leave opt_dse.cxx intact and consider preventing the
> bad copy propagation instead.  At PathScale, we did not do copy propagation
> on chi occurrences, which I think is the root cause of the bug."
>
> In my initial example in this thread, do you mean don't copy propgate sym2v3
> to STMT3?
> It's not the copy propagation in CreateCodeMap.
> This happens after DSE mark sym2 chi node on STMT1 not live.
> DSE::Dead_store_elim calles DSE::Propagate_vsym_bb, propagte sym2v3 to
> STMT3, because sym2v4 = chi(sym2v3) is dead.
>
> I think this logical is correct, otherwise, STMT3's sym2 chi node's operand
> can't directly find a live def.
>
> STMT1
> sym2v4 = chi(sym2v3)  NOT LIVE
> default_vsym_v4 = chi(default_vsym_v3) LIVE
> ....
> STMT2
>   ILOAD mu(default_vsym_v4)
> ...
> STMT3
> sym2v5 = chi(sym2v3) LIVE // sym2v3 is propagate down from above dead chi.
> default_vsym_v5 = chi(default_vsym_v4)
>
> I think this patch is safe to apply on the current DSE.
>
> Regards
> Shi Hui
>
> On Sun, Jul 24, 2011 at 11:18 AM, Fred Chow <frdc...@gmail.com> wrote:
>>
>> My comments below:
>>
>> On 07/23/2011 09:00 AM, Hui Shi wrote:
>>
>> Hi Fred,
>>
>> Thanks for your comments.
>> Comments below.
>>
>> On Sat, Jul 23, 2011 at 8:04 PM, Fred Chow <frdc...@gmail.com> wrote:
>>>
>>> Hi Shi,
>>>
>>> Your change to force other aliased chi nodes to live in
>>> DSE::Set_Reqruied_CHI() is fine with me.  But I'm not comfortable with your
>>> introduction of the _vse_required_set, which I understand is for finding
>>> more dead stores.  More below:
>>>
>>> On 07/22/2011 01:50 AM, Hui Shi wrote:
>>>
>>> Hi Fred,
>>>
>>> Follow your suggestion, if one chi node is live, other aliased chi nodes
>>> in same chi_list should live.
>>> I fix this issue in DSE,  could you help review this patch?
>>>
>>> Fix is:
>>> 1. Add hash_set _vse_requried_set in DSE, record if VER_STAB_ENTRY is
>>> once processed by DSE::Set_Required_VSE.
>>>
>>> Each vse entry is an SSA version of an aux entry.  Thus, from vse to aux
>>> is a many-to-one mapping.  Given a vse, vse->Any_use() already tells me that
>>> this vse has once been required.  Why do you have to set up this
>>> _vse_required_set, and using DSE::VSE_required() to check the information
>>> that is given by vse->Any_use()?  Should it be an aux set instead of a vse
>>> set?
>>
>> One vse entry set in _vse_required_set means this vse entry was once
>> processed by DSE::Set_Required_VSE.
>> However this doesn't means vse->Any_use() is true.
>>
>> I agree that the above is the case with the latest version of
>> opt_dse.cxx.  But in the original version of opt_dse.cxx from PathScale,
>> vse->Any_use() is set to true whenever Set_required_VSE() is entered.  So
>> there has obviously been enhancements put in, and your creation of
>> _vse_required_set is made necessary due to those enhancements.  Since those
>> enhancements has not been reviewed by me, I would appreciate the folks who
>> did those enhancements provide their review opinion on this latest addition.
>>
>> In DSE::Set_Required_VSE( VER_STAB_ENTRY *vse, BOOL real_use, WN *ref_wn
>> ).
>> When VSE's def type is WHIRL statement or PHI, vse->any_use is true.
>> When VSE's def type is CHI, VSE entry any_use will not set true, if AUX's
>> last store write the same mem location with current chi's statement.
>>
>> VSE entry in _vse_required_set means its use in ldid/mu/phi/chi is live
>> then this VSE entry is processed by DSE::Set_Required_VSE.
>> But not necessary means VSE entry's def chi node is live.
>>
>>>
>>> 2. In DSE::Set_Required_CHI
>>>     1. If chi node is live, find aliased/non-live, once required chi
>>> nodes in same chi list. If find, force these chi nodes live.
>>>
>>> If a chi is non-live, then "once required" must be false.  So there is a
>>> contradiction here.  If your intention is "aux-once-required", there is
>>> still a problem, because the aux may be found to be required later.
>>
>> From above explanation, it force live chi nodes that are
>> 1. once processed in DSE::Set_Required_CHI before.
>> 2. not live due to chi node's whirl statment write same mem location with
>> same AUX's last store.
>>
>>>         Set last_store be NULL and call set_required_vse.
>>>     2. If chi node is not live, find aliased chi nodes live in same chi
>>> list. If find, set current chi node live.
>>> 3. Remove previous fix in set_required_mu
>>>
>>> VER_STAB_ENTRY is required means this AUX symbol's some version is really
>>> used in program.
>>> Using this information, we can exclude mark some AUX symbol's chi node
>>> live, if this AUX is never used.
>>> This is meaningful in following case,
>>> int foo(int a) {
>>> int t1, t2;
>>> int *p;
>>>
>>> if(a){
>>>   p = &t1;
>>> }
>>> else {
>>>   p = &t2;
>>> }
>>> t1 =3;
>>> *p = 5;
>>> return t2;
>>> }
>>>
>>> chi nodes live info after original DSE
>>>    I4I4LDID 0 <st 20> T<4,.predef_I4,4>
>>>    I4INTCONST 0 (0x0)
>>>   I4I4NE
>>>   U8LDA 0 <st 4> T<53,anon_ptr.,8>
>>>   U8LDA 0 <st 5> T<53,anon_ptr.,8>
>>>  U8SELECT
>>> U8STID 0 <st 21> T<53,anon_ptr.,8> {line: 1/6} {freq: 0, ln: 6, col: 0}
>>>  I4INTCONST 3 (0x3)
>>> I4STID 0 <st 23> T<4,.predef_I4,4> {line: 1/12} {freq: 0, ln: 12, col:
>>> 0}  // STMT1
>>> sym9v22 <- chi( sym9v19 ) NOT LIVE     // default vsym chi is dead. this
>>> statement is dead
>>>  I4INTCONST 5 (0x5)
>>>  U8U8LDID 0 <st 21> T<53,anon_ptr.,8>
>>> I4ISTORE 0 T<53,anon_ptr.,8> {line: 1/13} {freq: 0, ln: 13, col: 0}
>>> //STMT2
>>> sym4v24 <- chi( sym4v23 ) NOT LIVE  // t1 chi node is dead
>>> sym5v25 <- chi( sym5v15 ) LIVE         // t2 chi node is live
>>> sym8v26 <- chi( sym8v18 ) LIVE          // return vsym chi ndoe is live
>>> sym9v27 <- chi( sym9v22 ) NOT LIVE  // default vsym chi is dead
>>>  I4I4LDID 0 <st 25> T<4,.predef_I4,4>
>>> I4STID 1 <st 28> T<4,.predef_I4,4> {line: 1/14} {freq: 0, ln: 14, col: 0}
>>> RETURN {line: 1/14} {freq: 0, ln: 14, col: 0}
>>>        mu[ sym8v26  sym8v26 ]
>>>
>>> If simply mark alias chi node live, default vsym chi ndoe will live in
>>> STMT2, because it's aliased with t2.
>>> Then follow DU chian, default vsym chi nodes in STMT1 is also live. Cause
>>> STMT1 not dead.
>>>
>>> Actually, default vsym will not be requried from return vsym or t2's DU
>>> chain.
>>> So we need allow aliased chi nodes not live exist, if this chi node is
>>> never required in DSE analysis.
>>>
>>> I consider default vsym as being live at STMT2, because it aliases with
>>> t2.  This will cause STMT1 to be live.  I understand by keeping the default
>>> vsym non-live, you can delete STMT1.  But my intuition is that making the
>>> default vsym non-live (when in fact it is live) will cause other bugs.
>>>
>>
>> default vsym chi nodes in STMT2 is not mark live, because it never
>> processed by DSE::Set_Required_VSE.
>> This means default vsym chi nodes in STMT2 has no explicit use.
>> So there is no requirement for getting def statement from deafult vsym's
>> use to STMT2. That's why I think its okay to leave the default vsym
>> non-live.
>>
>> Is this a legacy problem or not?Is there same problem in DSE's initial
>> version?
>>
>> Before your current fix, and also in the original version from PathScale,
>> default vsym is not live at STMT2.  But with your fix, and without your
>> _vse_required_set mechanism, default vsym would be made live at STMT2.  I
>> understand your effort is in trying to prevent performance regression due to
>> your fix.
>>
>> Strictly speaking, default vsym is live at STMT2, because it aliases with
>> t2 which is used at the return statement.  If this incorrect liveness of the
>> default vsym does not cause any wrong optimization, then it is still OK.
>> Maybe you should consider using a different approach to fix the bug.  For
>> example, you can leave opt_dse.cxx intact and consider preventing the bad
>> copy propagation instead.  At PathScale, we did not do copy propagation on
>> chi occurrences, which I think is the root cause of the bug.
>>
>> Fred
>>
>> Regards
>> Shi Hui
>>
>>
>>> Fred
>>>
>>> Regards
>>> Shi Hui
>>>
>>>
>>>
>>> On Tue, Jul 5, 2011 at 5:16 PM, Fred Chow <frdc...@gmail.com> wrote:
>>>>
>>>> Hi Shi,
>>>>
>>>> Your fix by calling CODEMAP rename is OK in its own right.  But this fix
>>>> is in reality for patching up a bug that was introduced by copy
>>>> propagation.  We should fix the root cause of the problem, NOT to patch it
>>>> so the symptom does not appear.
>>>>
>>>> Basically, the following from your original mail:
>>>>
>>>> STMT1
>>>> sym2v4 = chi(sym2v3)  NOT LIVE
>>>> default_vsym_v4 = chi(default_vsym_v3) LIVE
>>>>
>>>> is already wrong.  sym2v aliases with the default vsym.  Thus, one
>>>> cannot be live while the other is dead.
>>>>
>>>> In looking at the code in opt_dse.cxx, I found that it has diverged a
>>>> lot from the version that I'm familiar with.  Thus, I'm not in a position 
>>>> to
>>>> point out what the root problem is.  At least the version I worked with at
>>>> PathScale does not have this problem.  I would welcome others to provide
>>>> input on this.
>>>>
>>>> Fred
>>>>
>>>> On 07/04/2011 07:37 PM, Hui Shi wrote:
>>>>
>>>> Hi Fred,
>>>>
>>>> Thanks for your comments.
>>>> The reanming happens only when there are some dead phi/chi nodes revive
>>>> in CODEMAP::Iload_folded.
>>>> To evaluate how this patch will affect compile-time, I add dump when
>>>> this renaming happens.
>>>> Tested with SPEC2006 base/peak
>>>> Following cases have CODEMAP rename after create CODEMAPE, and renaming
>>>> count is seldom consider test's PU count..
>>>> 447.dealII   base/peak  rename 20+ times
>>>> 483.xalancbmk  base/peak rename 30+ times
>>>> 416.gamess  peak  rename 4 times
>>>> 458.sjeng     base   rename 1 time.
>>>> So this renaming rarely happens and I think we can afford the increased
>>>> compile time.
>>>>
>>>> And from another perspective. Rename_CODEMAP is already used in other
>>>> phase like aft CGF optimization,  COMP_UNIT::Fold_lda_iload_istore, etc.
>>>> These optimizations and renaming are more likely happen.
>>>>
>>>> One utiIity I hope WOPT can have is CODEMAP rename only rename somve AUX
>>>> (give in a table or list). Then rename cost can be decreased a lot.
>>>>
>>>>
>>>> For copy propagation issue.
>>>> The copy propagation happens in Dead_Store_Elminination,
>>>> DSE::Propagate_vsym_bb.
>>>> To disable the propagation, need change Dead_Store_Elimination. If one
>>>> chi node is live,
>>>> all other chi nodes binding to same statment and set_required should
>>>> live because they are  aliased with each other.
>>>> This will also cause compile time isuee, because more live phi/chi node
>>>> in codrep.
>>>>
>>>> Another issue is though we fix the copy propgation problem, the revive
>>>> logical is still there in CODEMAP::Iload_folded.
>>>> There still possible dead phi/chi nodes revive in CODEMAP creation. So
>>>> complete fix is rename after create CODEMAP.
>>>>
>>>> The main issue is create_codemap not simply create coderep, it also
>>>> perform iload folding may change version.
>>>> If iload folding function is totolay peformed in
>>>> COMP_UNIT::Fold_lda_iload_istore, it has no problem.
>>>>
>>>> Regards
>>>> Shi Hui
>>>>
>>>> On Mon, Jul 4, 2011 at 9:26 PM, Fred Chow <frdc...@gmail.com> wrote:
>>>>>
>>>>> Doing renaming is quite compile-time expensive, because it is like
>>>>> running the SSA construction algorithm.  The problem your mentioned would
>>>>> not have occurred if sym2v3 has not been propagated down to STMT3.  I 
>>>>> don't
>>>>> recall we were doing copy propagation for chi operands.  The overlapped 
>>>>> live
>>>>> range was caused by this bad copy propagation, NOT by the ILOAD folding.
>>>>>
>>>>> The above copy propagation is bad because sym2v4 is NOT dead, because
>>>>> it aliases with the default vsym which appears in the mu of STMT2.  So 
>>>>> your
>>>>> fix should be in preventing the copy propagation.
>>>>>
>>>>> Fred
>>>>>
>>>>> On 06/29/2011 11:35 PM, Hui Shi wrote:
>>>>>
>>>>> Could gatekeeper help review this fix?
>>>>>
>>>>> In SSA::Create_CODEMAP ILOAD folding is performed during coderep setup.
>>>>> The problem is iload folding may revive some dead phi/chi node.
>>>>>
>>>>> For example
>>>>> STMT1
>>>>> sym2v4 = chi(sym2v3)  NOT LIVE
>>>>> default_vsym_v4 = chi(default_vsym_v3) LIVE
>>>>> ....
>>>>> STMT2
>>>>>   ILOAD mu(default_vsym_v4)
>>>>> ...
>>>>> STMT3
>>>>> sym2v5 = chi(sym2v3) LIVE // sym2v3 is propagate down from above dead
>>>>> chi.
>>>>> default_vsym_v5 = chi(default_vsym_v4)
>>>>>
>>>>> If sym2v4 may use at STMT2, ILOAD's MU node must on STMT1's chi list
>>>>> (because MU node must alias with sym2, otherwise its alias issue).
>>>>> And MU node's corresponding CHI node must live on STMT1, because
>>>>> set_required_mu will set chi node live.
>>>>> So it's okay for DSE to mark fisrt CHI dead if STMT1 and STMT3 write
>>>>> same memory location.
>>>>> 1. It will not make STMT1 dead, because the other chi node is live.
>>>>> 2. ILOAD's MU node's def statment still STMT1
>>>>>
>>>>> If ILOAD folding transform ILOAD to LDID and LDID's AUX is sym2.
>>>>> ILOD_FOLDING revives dead chi sym2v4 = chi(sym2v3) without update the
>>>>> late chi node's opnd.
>>>>>
>>>>> This introduce overlapped between sym2v3 and sym2v4.
>>>>> Fix is rename CODEMAP if iload folding revive dead phi/chi node.
>>>>> OPT_REVISE_SSA::Fold_lda_indirects also rename CODEMAP after
>>>>> optimization.
>>>>>
>>>>> Fix is
>>>>> 1. record in htable, if dead chi/phi node is revived in
>>>>> SSA::Create_CODEMAP.
>>>>>    This only happends when ILOAD_Folding happens and find corresponding
>>>>> chi/phi is dead.
>>>>> 2. After SSA::Create_CODEMAP, rename coderep if dead chi/phi node is
>>>>> revive in CODEMAP setup.
>>>>>
>>>>> Different with fixing in DSE to mark first chi node live, this fix not
>>>>> affect DSE, so not affect performance.
>>>>> And this is verified with spec2006.rate on xeon.
>>>>>
>>>>> Regards
>>>>> Shi Hui
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> All of the data generated in your IT infrastructure is seriously
>>>>> valuable.
>>>>> Why? It contains a definitive record of application performance,
>>>>> security
>>>>> threats, fraudulent activity, and more. Splunk takes this data and
>>>>> makes
>>>>> sense of it. IT sense. And common sense.
>>>>> http://p.sf.net/sfu/splunk-d2d-c2
>>>>>
>>>>> _______________________________________________
>>>>> Open64-devel mailing list
>>>>> Open64-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/open64-devel
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
> ------------------------------------------------------------------------------
> Storage Efficiency Calculator
> This modeling tool is based on patent-pending intellectual property that
> has been used successfully in hundreds of IBM storage optimization engage-
> ments, worldwide.  Store less, Store more with what you own, Move data to
> the right place. Try It Now!
> http://www.accelacomm.com/jaw/sfnl/114/51427378/
> _______________________________________________
> Open64-devel mailing list
> Open64-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/open64-devel
>
>

------------------------------------------------------------------------------
Storage Efficiency Calculator
This modeling tool is based on patent-pending intellectual property that
has been used successfully in hundreds of IBM storage optimization engage-
ments, worldwide.  Store less, Store more with what you own, Move data to 
the right place. Try It Now! http://www.accelacomm.com/jaw/sfnl/114/51427378/
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel

Reply via email to