Hi Fred,
The new enhancement in DSE introduce problem that some chi node is not live
due to same location with last store.
For fixing the problem from copy propagation.
" For example, you can leave opt_dse.cxx intact and consider preventing the
bad copy propagation instead. At PathScale, we did not do copy propagation
on chi occurrences, which I think is the root cause of the bug."
In my initial example in this thread, do you mean don't copy propgate sym2v3
to STMT3?
It's not the copy propagation in CreateCodeMap.
This happens after DSE mark sym2 chi node on STMT1 not live.
DSE::Dead_store_elim calles DSE::Propagate_vsym_bb, propagte sym2v3 to
STMT3, because sym2v4 = chi(sym2v3) is dead.
I think this logical is correct, otherwise, STMT3's sym2 chi node's operand
can't directly find a live def.
STMT1
sym2v4 = chi(sym2v3) NOT LIVE
default_vsym_v4 = chi(default_vsym_v3) LIVE
....
STMT2
ILOAD mu(default_vsym_v4)
...
STMT3
sym2v5 = chi(sym2v3) LIVE // sym2v3 is propagate down from above dead chi.
default_vsym_v5 = chi(default_vsym_v4)
I think this patch is safe to apply on the current DSE.
Regards
Shi Hui
On Sun, Jul 24, 2011 at 11:18 AM, Fred Chow <frdc...@gmail.com> wrote:
> **
> My comments below:
>
>
> On 07/23/2011 09:00 AM, Hui Shi wrote:
>
> Hi Fred,
>
> Thanks for your comments.
> Comments below.
>
> On Sat, Jul 23, 2011 at 8:04 PM, Fred Chow <frdc...@gmail.com> wrote:
>
>> Hi Shi,
>>
>> Your change to force other aliased chi nodes to live in
>> DSE::Set_Reqruied_CHI() is fine with me. But I'm not comfortable with your
>> introduction of the _vse_required_set, which I understand is for finding
>> more dead stores. More below:
>>
>>
>> On 07/22/2011 01:50 AM, Hui Shi wrote:
>>
>> Hi Fred,
>>
>> Follow your suggestion, if one chi node is live, other aliased chi nodes
>> in same chi_list should live.
>> I fix this issue in DSE, could you help review this patch?
>>
>> Fix is:
>> 1. Add hash_set _vse_requried_set in DSE, record if VER_STAB_ENTRY is once
>> processed by DSE::Set_Required_VSE.
>>
>> Each vse entry is an SSA version of an aux entry. Thus, from vse to aux
>> is a many-to-one mapping. Given a vse, vse->Any_use() already tells me that
>> this vse has once been required. Why do you have to set up this
>> _vse_required_set, and using DSE::VSE_required() to check the information
>> that is given by vse->Any_use()? Should it be an aux set instead of a vse
>> set?
>>
>
> One vse entry set in _vse_required_set means this vse entry was once
> processed by DSE::Set_Required_VSE.
> However this doesn't means vse->Any_use() is true.
>
> I agree that the above is the case with the latest version of opt_dse.cxx.
> But in the original version of opt_dse.cxx from PathScale, vse->Any_use() is
> set to true whenever Set_required_VSE() is entered. So there has obviously
> been enhancements put in, and your creation of _vse_required_set is made
> necessary due to those enhancements. Since those enhancements has not been
> reviewed by me, I would appreciate the folks who did those enhancements
> provide their review opinion on this latest addition.
>
> In DSE::Set_Required_VSE( VER_STAB_ENTRY *vse, BOOL real_use, WN *ref_wn
> ).
> When VSE's def type is WHIRL statement or PHI, vse->any_use is true.
> When VSE's def type is CHI, VSE entry any_use will not set true, if AUX's
> last store write the same mem location with current chi's statement.
>
> VSE entry in _vse_required_set means its use in ldid/mu/phi/chi is live
> then this VSE entry is processed by DSE::Set_Required_VSE.
> But not necessary means VSE entry's def chi node is live.
>
>
>> 2. In DSE::Set_Required_CHI
>> 1. If chi node is live, find aliased/non-live, once required chi nodes
>> in same chi list. If find, force these chi nodes live.
>>
>> If a chi is non-live, then "once required" must be false. So there is a
>> contradiction here. If your intention is "aux-once-required", there is
>> still a problem, because the aux may be found to be required later.
>>
>>
> From above explanation, it force live chi nodes that are
> 1. once processed in DSE::Set_Required_CHI before.
> 2. not live due to chi node's whirl statment write same mem location with
> same AUX's last store.
>
> Set last_store be NULL and call set_required_vse.
>> 2. If chi node is not live, find aliased chi nodes live in same chi
>> list. If find, set current chi node live.
>> 3. Remove previous fix in set_required_mu
>>
>> VER_STAB_ENTRY is required means this AUX symbol's some version is really
>> used in program.
>> Using this information, we can exclude mark some AUX symbol's chi node
>> live, if this AUX is never used.
>> This is meaningful in following case,
>> int foo(int a) {
>> int t1, t2;
>> int *p;
>>
>> if(a){
>> p = &t1;
>> }
>> else {
>> p = &t2;
>> }
>> t1 =3;
>> *p = 5;
>> return t2;
>> }
>>
>> chi nodes live info after original DSE
>> I4I4LDID 0 <st 20> T<4,.predef_I4,4>
>> I4INTCONST 0 (0x0)
>> I4I4NE
>> U8LDA 0 <st 4> T<53,anon_ptr.,8>
>> U8LDA 0 <st 5> T<53,anon_ptr.,8>
>> U8SELECT
>> U8STID 0 <st 21> T<53,anon_ptr.,8> {line: 1/6} {freq: 0, ln: 6, col: 0}
>> I4INTCONST 3 (0x3)
>> I4STID 0 <st 23> T<4,.predef_I4,4> {line: 1/12} {freq: 0, ln: 12, col: 0}
>> // STMT1
>> sym9v22 <- chi( sym9v19 ) NOT LIVE // default vsym chi is dead. this
>> statement is dead
>> I4INTCONST 5 (0x5)
>> U8U8LDID 0 <st 21> T<53,anon_ptr.,8>
>> I4ISTORE 0 T<53,anon_ptr.,8> {line: 1/13} {freq: 0, ln: 13, col: 0}
>> //STMT2
>> sym4v24 <- chi( sym4v23 ) NOT LIVE // t1 chi node is dead
>> sym5v25 <- chi( sym5v15 ) LIVE // t2 chi node is live
>> sym8v26 <- chi( sym8v18 ) LIVE // return vsym chi ndoe is live
>> sym9v27 <- chi( sym9v22 ) NOT LIVE // default vsym chi is dead
>> I4I4LDID 0 <st 25> T<4,.predef_I4,4>
>> I4STID 1 <st 28> T<4,.predef_I4,4> {line: 1/14} {freq: 0, ln: 14, col: 0}
>> RETURN {line: 1/14} {freq: 0, ln: 14, col: 0}
>> mu[ sym8v26 sym8v26 ]
>>
>> If simply mark alias chi node live, default vsym chi ndoe will live in
>> STMT2, because it's aliased with t2.
>> Then follow DU chian, default vsym chi nodes in STMT1 is also live. Cause
>> STMT1 not dead.
>>
>> Actually, default vsym will not be requried from return vsym or t2's DU
>> chain.
>> So we need allow aliased chi nodes not live exist, if this chi node is
>> never required in DSE analysis.
>>
>> I consider default vsym as being live at STMT2, because it aliases with
>> t2. This will cause STMT1 to be live. I understand by keeping the default
>> vsym non-live, you can delete STMT1. But my intuition is that making the
>> default vsym non-live (when in fact it is live) will cause other bugs.
>>
>>
> default vsym chi nodes in STMT2 is not mark live, because it never
> processed by DSE::Set_Required_VSE.
> This means default vsym chi nodes in STMT2 has no explicit use.
> So there is no requirement for getting def statement from deafult vsym's
> use to STMT2. That's why I think its okay to leave the default vsym
> non-live.
>
> Is this a legacy problem or not?Is there same problem in DSE's initial
> version?
>
> Before your current fix, and also in the original version from PathScale,
> default vsym is not live at STMT2. But with your fix, and without your
> _vse_required_set mechanism, default vsym would be made live at STMT2. I
> understand your effort is in trying to prevent performance regression due to
> your fix.
>
> Strictly speaking, default vsym is live at STMT2, because it aliases with
> t2 which is used at the return statement. If this incorrect liveness of the
> default vsym does not cause any wrong optimization, then it is still OK.
> Maybe you should consider using a different approach to fix the bug. For
> example, you can leave opt_dse.cxx intact and consider preventing the bad
> copy propagation instead. At PathScale, we did not do copy propagation on
> chi occurrences, which I think is the root cause of the bug.
>
> Fred
>
>
> Regards
> Shi Hui
>
>
> Fred
>>
>>
>> Regards
>> Shi Hui
>>
>>
>>
>> On Tue, Jul 5, 2011 at 5:16 PM, Fred Chow <frdc...@gmail.com> wrote:
>>
>>> Hi Shi,
>>>
>>> Your fix by calling CODEMAP rename is OK in its own right. But this fix
>>> is in reality for patching up a bug that was introduced by copy
>>> propagation. We should fix the root cause of the problem, NOT to patch it
>>> so the symptom does not appear.
>>>
>>> Basically, the following from your original mail:
>>>
>>>
>>> STMT1
>>> sym2v4 = chi(sym2v3) NOT LIVE
>>> default_vsym_v4 = chi(default_vsym_v3) LIVE
>>>
>>> is already wrong. sym2v aliases with the default vsym. Thus, one
>>> cannot be live while the other is dead.
>>>
>>> In looking at the code in opt_dse.cxx, I found that it has diverged a lot
>>> from the version that I'm familiar with. Thus, I'm not in a position to
>>> point out what the root problem is. At least the version I worked with at
>>> PathScale does not have this problem. I would welcome others to provide
>>> input on this.
>>>
>>> Fred
>>>
>>>
>>> On 07/04/2011 07:37 PM, Hui Shi wrote:
>>>
>>> Hi Fred,
>>>
>>> Thanks for your comments.
>>> The reanming happens only when there are some dead phi/chi nodes revive
>>> in CODEMAP::Iload_folded.
>>> To evaluate how this patch will affect compile-time, I add dump when this
>>> renaming happens.
>>> Tested with SPEC2006 base/peak
>>> Following cases have CODEMAP rename after create CODEMAPE, and renaming
>>> count is seldom consider test's PU count..
>>> 447.dealII base/peak rename 20+ times
>>> 483.xalancbmk base/peak rename 30+ times
>>> 416.gamess peak rename 4 times
>>> 458.sjeng base rename 1 time.
>>> So this renaming rarely happens and I think we can afford the increased
>>> compile time.
>>>
>>> And from another perspective. Rename_CODEMAP is already used in other
>>> phase like aft CGF optimization, COMP_UNIT::Fold_lda_iload_istore, etc.
>>> These optimizations and renaming are more likely happen.
>>>
>>> One utiIity I hope WOPT can have is CODEMAP rename only rename somve AUX
>>> (give in a table or list). Then rename cost can be decreased a lot.
>>>
>>>
>>> For copy propagation issue.
>>> The copy propagation happens in Dead_Store_Elminination,
>>> DSE::Propagate_vsym_bb.
>>> To disable the propagation, need change Dead_Store_Elimination. If one
>>> chi node is live,
>>> all other chi nodes binding to same statment and set_required should live
>>> because they are aliased with each other.
>>> This will also cause compile time isuee, because more live phi/chi node
>>> in codrep.
>>>
>>> Another issue is though we fix the copy propgation problem, the revive
>>> logical is still there in CODEMAP::Iload_folded.
>>> There still possible dead phi/chi nodes revive in CODEMAP creation. So
>>> complete fix is rename after create CODEMAP.
>>>
>>> The main issue is create_codemap not simply create coderep, it also
>>> perform iload folding may change version.
>>> If iload folding function is totolay peformed in
>>> COMP_UNIT::Fold_lda_iload_istore, it has no problem.
>>>
>>> Regards
>>> Shi Hui
>>>
>>> On Mon, Jul 4, 2011 at 9:26 PM, Fred Chow <frdc...@gmail.com> wrote:
>>>
>>>> Doing renaming is quite compile-time expensive, because it is like
>>>> running the SSA construction algorithm. The problem your mentioned would
>>>> not have occurred if sym2v3 has not been propagated down to STMT3. I don't
>>>> recall we were doing copy propagation for chi operands. The overlapped
>>>> live
>>>> range was caused by this bad copy propagation, NOT by the ILOAD folding.
>>>>
>>>> The above copy propagation is bad because sym2v4 is NOT dead, because it
>>>> aliases with the default vsym which appears in the mu of STMT2. So your
>>>> fix
>>>> should be in preventing the copy propagation.
>>>>
>>>> Fred
>>>>
>>>>
>>>> On 06/29/2011 11:35 PM, Hui Shi wrote:
>>>>
>>>> Could gatekeeper help review this fix?
>>>>
>>>> In SSA::Create_CODEMAP ILOAD folding is performed during coderep setup.
>>>> The problem is iload folding may revive some dead phi/chi node.
>>>>
>>>> For example
>>>> STMT1
>>>> sym2v4 = chi(sym2v3) NOT LIVE
>>>> default_vsym_v4 = chi(default_vsym_v3) LIVE
>>>> ....
>>>> STMT2
>>>> ILOAD mu(default_vsym_v4)
>>>> ...
>>>> STMT3
>>>> sym2v5 = chi(sym2v3) LIVE // sym2v3 is propagate down from above dead
>>>> chi.
>>>> default_vsym_v5 = chi(default_vsym_v4)
>>>>
>>>> If sym2v4 may use at STMT2, ILOAD's MU node must on STMT1's chi list
>>>> (because MU node must alias with sym2, otherwise its alias issue).
>>>> And MU node's corresponding CHI node must live on STMT1, because
>>>> set_required_mu will set chi node live.
>>>> So it's okay for DSE to mark fisrt CHI dead if STMT1 and STMT3 write
>>>> same memory location.
>>>> 1. It will not make STMT1 dead, because the other chi node is live.
>>>> 2. ILOAD's MU node's def statment still STMT1
>>>>
>>>> If ILOAD folding transform ILOAD to LDID and LDID's AUX is sym2.
>>>> ILOD_FOLDING revives dead chi sym2v4 = chi(sym2v3) without update the
>>>> late chi node's opnd.
>>>>
>>>> This introduce overlapped between sym2v3 and sym2v4.
>>>> Fix is rename CODEMAP if iload folding revive dead phi/chi node.
>>>> OPT_REVISE_SSA::Fold_lda_indirects also rename CODEMAP after
>>>> optimization.
>>>>
>>>> Fix is
>>>> 1. record in htable, if dead chi/phi node is revived in
>>>> SSA::Create_CODEMAP.
>>>> This only happends when ILOAD_Folding happens and find corresponding
>>>> chi/phi is dead.
>>>> 2. After SSA::Create_CODEMAP, rename coderep if dead chi/phi node is
>>>> revive in CODEMAP setup.
>>>>
>>>> Different with fixing in DSE to mark first chi node live, this fix not
>>>> affect DSE, so not affect performance.
>>>> And this is verified with spec2006.rate on xeon.
>>>>
>>>> Regards
>>>> Shi Hui
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> All of the data generated in your IT infrastructure is seriously valuable.
>>>> Why? It contains a definitive record of application performance, security
>>>> threats, fraudulent activity, and more. Splunk takes this data and makes
>>>> sense of it. IT sense. And common sense.
>>>> http://p.sf.net/sfu/splunk-d2d-c2
>>>>
>>>>
>>>> _______________________________________________
>>>> Open64-devel mailing
>>>> listOpen64-devel@lists.sourceforge.nethttps://lists.sourceforge.net/lists/listinfo/open64-devel
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
------------------------------------------------------------------------------
Storage Efficiency Calculator
This modeling tool is based on patent-pending intellectual property that
has been used successfully in hundreds of IBM storage optimization engage-
ments, worldwide. Store less, Store more with what you own, Move data to
the right place. Try It Now! http://www.accelacomm.com/jaw/sfnl/114/51427378/
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel