Correction: my statement below applies to SSE2. These instructions are present in SSE4,
so vectorization should work when compiling for that target.
Regards,
Shivaram
From: Rao, Shivarama [mailto:shivarama....@amd.com]
Sent: Friday, January 18, 2013 4:44 PM
To: Huan Luo; Das, Dibyendu; open64 mailing list
Subject: Re: [Open64-devel] How to dump the ir tree after vectorization?
Hi,
There are no SIMD multiply instructions that take two INT4 operands and
produce an INT4 result. That is why this loop is not getting vectorized.
It will get vectorized if you change the array types to short/float/double.
You can dump the internal structure after LNO using the -Wb,-trLNO option.
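As a minimal sketch of the type change suggested above, the same loop with float arrays might look like the following. The function name fun, the constant N and the parameter n are illustrative assumptions and are not part of the original fun.c; only the loop body is taken from it.

```c
#include <assert.h>

#define N 1024  /* same array size as in the original fun.c */

/* float instead of int: the element type is the only intended change */
float a[N], b[N], c[N];

/* Hypothetical helper wrapping the loop from fun.c.
   n is the number of elements to process (at most N). */
void fun(int n)
{
    int i;

    assert(n >= 0 && n <= N);
    for (i = 0; i < n; i++) {
        a[i] = a[i] + b[i] * c[i];
    }
}
```

Compiling this with the same options as in the report below (plus -c, assuming the driver accepts it as gcc does, since the sketch has no main) and -LNO:simd_verbose=ON should show whether the loop is now vectorized. The exact message is not reproduced here because it has not been checked.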
Regards,
Shivaram
From: Huan Luo [mailto:luo_huan...@126.com]
Sent: Friday, January 18, 2013 6:21 AM
To: Das, Dibyendu; open64 mailing list
Subject: Re: [Open64-devel] How to dump the ir tree after vectorization?
Hi,
There seems to be a problem. Open64 cannot vectorize the loop...
$ opencc -O3 -msse2 -ffast-math -LNO:simd=2 -LNO:simd_verbose=ON fun.c
And the output is:
(fun.c:6) Expression rooted at op "OPC_I4MPY"(line 7) is not vectorizable. Loop was not vectorized.
I don't see why it cannot vectorize such a simple loop.
fun.c
==========================
int i, j;
int a[1024], b[1024], c[1024];
main()
{
    for (i=0; i<1024; i++) {
        a[i]=a[i]+b[i]*c[i];
    }
}
==========================
--
Best wishes.
Huan Luo
At 2013-01-17 17:07:13, "Das, Dibyendu"
<dibyendu....@amd.com> wrote:
Try -LNO:simd_verbose=ON
From: Huan Luo [mailto:luo_huan...@126.com]
Sent: Thursday, January 17, 2013 2:31 PM
To: Das, Dibyendu
Subject: Re:RE: [Open64-devel] How to dump the ir tree after vectorization?
Thank you very much. That works.
And any idea on how to dump the internal structure after vectorization?
--
Best wishes.
Huan Luo
At 2013-01-17 16:45:58, "Das, Dibyendu"
<dibyendu....@amd.com> wrote:
At -O3 and above you should be able to use -LNO:simd=N (with N being 1, 2, or 3) for vectorization.
-dibyendu
From: Huan Luo [mailto:luo_huan...@126.com]
Sent: Thursday, January 17, 2013 12:56 PM
To: open64 mailing list
Subject: [Open64-devel] How to dump the ir tree after vectorization?
Hi,
Lately we've been trying to use open64 to vectorize our
program and dump the IR tree for later analysis, but we are not
sure which options to use. Could anybody tell us?
A similar purpose can be achieved by gcc. For example:
fun.c
===========================================
int i, j;
int a[1024], b[1024], c[1024];
main()
{
    for (i=0; i<100; i++) {
        a[i]=a[i]+b[i]*c[i];
    }
}
============================================
We use the option -ftree-vectorize to apply vectorization, and
-fdump-tree-uncprop to dump the functions after vectorization.
The command is:
gcc -O -dA -msse2 -ffast-math -ftree-vectorize -fdump-tree-uncprop fun.c
and the file looks like this:
fun.c.136t.uncprop
============================================
;; Function main (main) (executed once)
main ()
{
  long unsigned int ivtmp.30;
  vector(4) int vect_var_.21;
  vector(4) int vect_var_.20;
  vector(4) int vect_var_.19;
  vector(4) int vect_var_.14;
  vector(4) int vect_var_.9;
<bb 2>:
<bb 3>:
  # ivtmp.30_15 = PHI <ivtmp.30_7(3), 0(2)>
  vect_var_.9_18 = MEM[symbol: a, index: ivtmp.30_15, offset: 0B];
  vect_var_.14_22 = MEM[symbol: b, index: ivtmp.30_15, offset: 0B];
  vect_var_.19_26 = MEM[symbol: c, index: ivtmp.30_15, offset: 0B];
  vect_var_.20_27 = vect_var_.14_22 * vect_var_.19_26;
  vect_var_.21_28 = vect_var_.9_18 + vect_var_.20_27;
  MEM[symbol: a, index: ivtmp.30_15, offset: 0B] = vect_var_.21_28;
  ivtmp.30_7 = ivtmp.30_15 + 16;
  if (ivtmp.30_7 != 400)
    goto <bb 3>;
  else
    goto <bb 4>;
<bb 4>:
  i = 100;
  return;
}
============================================
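For readers less familiar with this dump format, the loop in <bb 3> can be read roughly as the scalar C below. This is only an illustrative re-reading of the dump, assuming a 4-byte int; the function name fun_vectorized_shape and the variables base and lane are hypothetical and do not appear in the gcc output.

```c
#include <assert.h>

int a[1024], b[1024], c[1024];
int i;

/* Scalar equivalent of the vectorized loop: each pass of <bb 3> handles
   one vector(4) int, i.e. 4 ints = 16 bytes. */
void fun_vectorized_shape(void)
{
    unsigned long ivtmp;   /* byte offset, like ivtmp.30 in the dump */

    assert(sizeof(int) == 4);  /* assumption: 100 ints occupy 400 bytes */

    for (ivtmp = 0; ivtmp != 400; ivtmp += 16) {
        unsigned long base = ivtmp / sizeof(int);  /* first element of this vector */
        int lane;

        for (lane = 0; lane < 4; lane++) {         /* the 4 lanes of vector(4) int */
            a[base + lane] = a[base + lane] + b[base + lane] * c[base + lane];
        }
    }
    i = 100;               /* final value of i, stored in <bb 4> */
}
```

The outer loop runs 400 / 16 = 25 times, which covers the 100 elements of the original loop with no scalar remainder.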
However, gcc is not sufficient for our analysis, so I wonder if there is a way to do
the same in open64, which may turn out to be more convenient than gcc.
Your help would be greatly appreciated.
--
Best wishes.
Huan Luo
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel