Some additional analysis based on Ruiling's comment. The second load from p (to calculate res2) may or may not be issued. It depends on whether there are any instructions with side effects between the assignments to res1 and res2. For example, if there is a store instruction or a barrier in between, the second load will be issued and you will see two loads from the same pointer in the final instruction stream.
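To illustrate why an intervening store forces the reload, here is a minimal sketch in plain C rather than OpenCL C. It is not Beignet or LLVM output; the function compute, the struct results, the second pointer q and the stored value 7 are all made up for the example. The point is only that a store through q may change *p, so the compiler cannot safely reuse the value loaded for res1.

```c
/* Hypothetical stand-in for the kernel body in the question.
 * All names here (results, compute, q) are illustrative assumptions. */
struct results {
    int res1;
    int res2;
};

static struct results compute(const unsigned char *p, unsigned char *q,
                              int a, int b, int c, int d)
{
    struct results r;

    r.res1 = *p * (a * b + c * d);      /* first load from p */

    *q = 7;                             /* side effect: modifies *p if q == p */

    /* The compiler generally cannot prove that q != p, so *p has to be
     * loaded again here. a*b + c*d is not affected by the store. */
    r.res2 = *p * (a * b + c * d + 1);

    return r;
}
```

With x = 2 and a, b, c, d = 1, 2, 3, 4, calling compute(&x, &y, ...) with a separate y gives res1 = 28 and res2 = 30. Calling compute(&x, &x, ...) gives res2 = 105, because the store changed the value behind p. Without the store, reusing the first load would be safe.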
As for a*b + c*d, it will always be optimized and reused when calculating res2, which means that the res2 assignment generates only one add instruction, adding 1 to the previously calculated value.

On Thu, Feb 12, 2015 at 08:56:57AM +0000, Song, Ruiling wrote:
> It should not read global memory again. We already enable such kind of
> optimization pass in LLVM.
> And (a*b+c*d) should not calculate again. This is common-subexpression. Clang
> should do it easily. But I am not quite sure whether clang is affected by -O2
> or -O0. Anyone know details?
>
> To check specific kernel. You may need to ‘export
> OCL_OUTPUT_LLVM_AFTER_GEN=1’ and build your program again to get the LLVM IR.
>
> From: Beignet [mailto:[email protected]] On Behalf Of 彭席汉
> Sent: Thursday, February 12, 2015 4:40 PM
> To: [email protected]
> Subject: [Beignet] a question about default optimize option when building
>
> Hi:
>
> My CL kernel program looks like as follow:
>
> __global unsigned char *p;
> int a, b, c, d;
>
> res1 = *p * (a*b + c*d);
>
> <some code here>
>
> res2 = *p * (a*b + c*d + 1);
>
>
> If I use default build option, for res2, what will EU do? read global memory
> for pointer p again and do computing of "a*b + c*d" again?
> _______________________________________________
> Beignet mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/beignet
