My comments below:
On 07/23/2011 09:00 AM, Hui Shi wrote:
Hi Fred,
Thanks for your comments.
Comments below.
On Sat, Jul 23, 2011 at 8:04 PM, Fred Chow <frdc...@gmail.com
<mailto:frdc...@gmail.com>> wrote:
Hi Shi,
Your change to force other aliased chi nodes to live in
DSE::Set_Reqruied_CHI() is fine with me. But I'm not comfortable
with your introduction of the _vse_required_set, which I
understand is for finding more dead stores. More below:
On 07/22/2011 01:50 AM, Hui Shi wrote:
Hi Fred,
Follow your suggestion, if one chi node is live, other aliased
chi nodes in same chi_list should live.
I fix this issue in DSE, could you help review this patch?
Fix is:
1. Add hash_set _vse_requried_set in DSE, record if
VER_STAB_ENTRY is once processed by DSE::Set_Required_VSE.
Each vse entry is an SSA version of an aux entry. Thus, from vse
to aux is a many-to-one mapping. Given a vse, vse->Any_use()
already tells me that this vse has once been required. Why do you
have to set up this _vse_required_set, and using
DSE::VSE_required() to check the information that is given by
vse->Any_use()? Should it be an aux set instead of a vse set?
One vse entry set in _vse_required_set means this vse entry was once
processed by DSE::Set_Required_VSE.
However this doesn't means vse->Any_use() is true.
I agree that the above is the case with the latest version of
opt_dse.cxx. But in the original version of opt_dse.cxx from PathScale,
vse->Any_use() is set to true whenever Set_required_VSE() is entered.
So there has obviously been enhancements put in, and your creation of
_vse_required_set is made necessary due to those enhancements. Since
those enhancements has not been reviewed by me, I would appreciate the
folks who did those enhancements provide their review opinion on this
latest addition.
In DSE::Set_Required_VSE( VER_STAB_ENTRY *vse, BOOL real_use, WN
*ref_wn ).
When VSE's def type is WHIRL statement or PHI, vse->any_use is true.
When VSE's def type is CHI, VSE entry any_use will not set true, if
AUX's last store write the same mem location with current chi's statement.
VSE entry in _vse_required_set means its use in ldid/mu/phi/chi is
live then this VSE entry is processed by DSE::Set_Required_VSE.
But not necessary means VSE entry's def chi node is live.
2. In DSE::Set_Required_CHI
1. If chi node is live, find aliased/non-live, once required
chi nodes in same chi list. If find, force these chi nodes live.
If a chi is non-live, then "once required" must be false. So
there is a contradiction here. If your intention is
"aux-once-required", there is still a problem, because the aux may
be found to be required later.
From above explanation, it force live chi nodes that are
1. once processed in DSE::Set_Required_CHI before.
2. not live due to chi node's whirl statment write same mem location
with same AUX's last store.
Set last_store be NULL and call set_required_vse.
2. If chi node is not live, find aliased chi nodes live in
same chi list. If find, set current chi node live.
3. Remove previous fix in set_required_mu
VER_STAB_ENTRY is required means this AUX symbol's some version
is really used in program.
Using this information, we can exclude mark some AUX symbol's chi
node live, if this AUX is never used.
This is meaningful in following case,
int foo(int a) {
int t1, t2;
int *p;
if(a){
p = &t1;
}
else {
p = &t2;
}
t1 =3;
*p = 5;
return t2;
}
chi nodes live info after original DSE
I4I4LDID 0 <st 20> T<4,.predef_I4,4>
I4INTCONST 0 (0x0)
I4I4NE
U8LDA 0 <st 4> T<53,anon_ptr.,8>
U8LDA 0 <st 5> T<53,anon_ptr.,8>
U8SELECT
U8STID 0 <st 21> T<53,anon_ptr.,8> {line: 1/6} {freq: 0, ln: 6,
col: 0}
I4INTCONST 3 (0x3)
I4STID 0 <st 23> T<4,.predef_I4,4> {line: 1/12} {freq: 0, ln: 12,
col: 0} // STMT1
sym9v22 <- chi( sym9v19 ) NOT LIVE // default vsym chi is
dead. this statement is dead
I4INTCONST 5 (0x5)
U8U8LDID 0 <st 21> T<53,anon_ptr.,8>
I4ISTORE 0 T<53,anon_ptr.,8> {line: 1/13} {freq: 0, ln: 13, col:
0} //STMT2
sym4v24 <- chi( sym4v23 ) NOT LIVE // t1 chi node is dead
sym5v25 <- chi( sym5v15 ) LIVE // t2 chi node is live
sym8v26 <- chi( sym8v18 ) LIVE // return vsym chi ndoe
is live
sym9v27 <- chi( sym9v22 ) NOT LIVE // default vsym chi is dead
I4I4LDID 0 <st 25> T<4,.predef_I4,4>
I4STID 1 <st 28> T<4,.predef_I4,4> {line: 1/14} {freq: 0, ln: 14,
col: 0}
RETURN {line: 1/14} {freq: 0, ln: 14, col: 0}
mu[ sym8v26 sym8v26 ]
If simply mark alias chi node live, default vsym chi ndoe will
live in STMT2, because it's aliased with t2.
Then follow DU chian, default vsym chi nodes in STMT1 is also
live. Cause STMT1 not dead.
Actually, default vsym will not be requried from return vsym or
t2's DU chain.
So we need allow aliased chi nodes not live exist, if this chi
node is never required in DSE analysis.
I consider default vsym as being live at STMT2, because it aliases
with t2. This will cause STMT1 to be live. I understand by
keeping the default vsym non-live, you can delete STMT1. But my
intuition is that making the default vsym non-live (when in fact
it is live) will cause other bugs.
default vsym chi nodes in STMT2 is not mark live, because it never
processed by DSE::Set_Required_VSE.
This means default vsym chi nodes in STMT2 has no explicit use.
So there is no requirement for getting def statement from deafult
vsym's use to STMT2. That's why I think its okay to leave the default
vsym non-live.
Is this a legacy problem or not?Is there same problem in DSE's initial
version?
Before your current fix, and also in the original version from
PathScale, default vsym is not live at STMT2. But with your fix, and
without your _vse_required_set mechanism, default vsym would be made
live at STMT2. I understand your effort is in trying to prevent
performance regression due to your fix.
Strictly speaking, default vsym is live at STMT2, because it aliases
with t2 which is used at the return statement. If this incorrect
liveness of the default vsym does not cause any wrong optimization, then
it is still OK. Maybe you should consider using a different approach to
fix the bug. For example, you can leave opt_dse.cxx intact and consider
preventing the bad copy propagation instead. At PathScale, we did not
do copy propagation on chi occurrences, which I think is the root cause
of the bug.
Fred
Regards
Shi Hui
Fred
Regards
Shi Hui
On Tue, Jul 5, 2011 at 5:16 PM, Fred Chow <frdc...@gmail.com
<mailto:frdc...@gmail.com>> wrote:
Hi Shi,
Your fix by calling CODEMAP rename is OK in its own right.
But this fix is in reality for patching up a bug that was
introduced by copy propagation. We should fix the root cause
of the problem, NOT to patch it so the symptom does not appear.
Basically, the following from your original mail:
STMT1
sym2v4 = chi(sym2v3) NOT LIVE
default_vsym_v4 = chi(default_vsym_v3) LIVE
is already wrong. sym2v aliases with the default vsym.
Thus, one cannot be live while the other is dead.
In looking at the code in opt_dse.cxx, I found that it has
diverged a lot from the version that I'm familiar with.
Thus, I'm not in a position to point out what the root
problem is. At least the version I worked with at PathScale
does not have this problem. I would welcome others to
provide input on this.
Fred
On 07/04/2011 07:37 PM, Hui Shi wrote:
Hi Fred,
Thanks for your comments.
The reanming happens only when there are some dead phi/chi
nodes revive in CODEMAP::Iload_folded.
To evaluate how this patch will affect compile-time, I add
dump when this renaming happens.
Tested with SPEC2006 base/peak
Following cases have CODEMAP rename after create CODEMAPE,
and renaming count is seldom consider test's PU count..
447.dealII base/peak rename 20+ times
483.xalancbmk base/peak rename 30+ times
416.gamess peak rename 4 times
458.sjeng base rename 1 time.
So this renaming rarely happens and I think we can afford
the increased compile time.
And from another perspective. Rename_CODEMAP is already used
in other phase like aft CGF optimization,
COMP_UNIT::Fold_lda_iload_istore, etc.
These optimizations and renaming are more likely happen.
One utiIity I hope WOPT can have is CODEMAP rename only
rename somve AUX (give in a table or list). Then rename cost
can be decreased a lot.
For copy propagation issue.
The copy propagation happens in Dead_Store_Elminination,
DSE::Propagate_vsym_bb.
To disable the propagation, need change
Dead_Store_Elimination. If one chi node is live,
all other chi nodes binding to same statment and
set_required should live because they are aliased with each
other.
This will also cause compile time isuee, because more live
phi/chi node in codrep.
Another issue is though we fix the copy propgation problem,
the revive logical is still there in CODEMAP::Iload_folded.
There still possible dead phi/chi nodes revive in CODEMAP
creation. So complete fix is rename after create CODEMAP.
The main issue is create_codemap not simply create coderep,
it also perform iload folding may change version.
If iload folding function is totolay peformed in
COMP_UNIT::Fold_lda_iload_istore, it has no problem.
Regards
Shi Hui
On Mon, Jul 4, 2011 at 9:26 PM, Fred Chow <frdc...@gmail.com
<mailto:frdc...@gmail.com>> wrote:
Doing renaming is quite compile-time expensive, because
it is like running the SSA construction algorithm. The
problem your mentioned would not have occurred if sym2v3
has not been propagated down to STMT3. I don't recall
we were doing copy propagation for chi operands. The
overlapped live range was caused by this bad copy
propagation, NOT by the ILOAD folding.
The above copy propagation is bad because sym2v4 is NOT
dead, because it aliases with the default vsym which
appears in the mu of STMT2. So your fix should be in
preventing the copy propagation.
Fred
On 06/29/2011 11:35 PM, Hui Shi wrote:
Could gatekeeper help review this fix?
In SSA::Create_CODEMAP ILOAD folding is performed
during coderep setup.
The problem is iload folding may revive some dead
phi/chi node.
For example
STMT1
sym2v4 = chi(sym2v3) NOT LIVE
default_vsym_v4 = chi(default_vsym_v3) LIVE
....
STMT2
ILOAD mu(default_vsym_v4)
...
STMT3
sym2v5 = chi(sym2v3) LIVE // sym2v3 is propagate down
from above dead chi.
default_vsym_v5 = chi(default_vsym_v4)
If sym2v4 may use at STMT2, ILOAD's MU node must on
STMT1's chi list
(because MU node must alias with sym2, otherwise its
alias issue).
And MU node's corresponding CHI node must live on
STMT1, because set_required_mu will set chi node live.
So it's okay for DSE to mark fisrt CHI dead if STMT1
and STMT3 write same memory location.
1. It will not make STMT1 dead, because the other chi
node is live.
2. ILOAD's MU node's def statment still STMT1
If ILOAD folding transform ILOAD to LDID and LDID's AUX
is sym2.
ILOD_FOLDING revives dead chi sym2v4 = chi(sym2v3)
without update the late chi node's opnd.
This introduce overlapped between sym2v3 and sym2v4.
Fix is rename CODEMAP if iload folding revive dead
phi/chi node.
OPT_REVISE_SSA::Fold_lda_indirects also rename CODEMAP
after optimization.
Fix is
1. record in htable, if dead chi/phi node is revived in
SSA::Create_CODEMAP.
This only happends when ILOAD_Folding happens and
find corresponding chi/phi is dead.
2. After SSA::Create_CODEMAP, rename coderep if dead
chi/phi node is revive in CODEMAP setup.
Different with fixing in DSE to mark first chi node
live, this fix not affect DSE, so not affect performance.
And this is verified with spec2006.rate on xeon.
Regards
Shi Hui
------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously
valuable.
Why? It contains a definitive record of application performance,
security
threats, fraudulent activity, and more. Splunk takes this data and
makes
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
<mailto:Open64-devel@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/open64-devel
------------------------------------------------------------------------------
Magic Quadrant for Content-Aware Data Loss Prevention
Research study explores the data loss prevention market. Includes in-depth
analysis on the changes within the DLP market, and the criteria used to
evaluate the strengths and weaknesses of these DLP solutions.
http://www.accelacomm.com/jaw/sfnl/114/51385063/
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel