Thanks for the suggestion. I will update the fix and send it for code review again when it is ready.
--Feng

On Tue, Feb 15, 2011 at 10:58 PM, Fred Chow <frdc...@gmail.com> wrote:
> Hi Feng,
>
> Your fix prevents the good optimization to avoid the assertion. This is a
> good optimization because changing the 64-bit operation to 32-bit helps
> performance under -m32, because 64-bit operations are simulated by pairs of
> 32-bit operations.
>
> I think the better fix (without preventing the good optimization) is to also
> fix the size of the operands by inserting U4U8CVT for both operands.
>
> Fred Chow
>
> On 02/15/2011 09:31 AM, Feng Zhou wrote:
>
> Hello, all
>
> Can gatekeeper review the patch to bug #544 for me please? Thank you.
>
> Bug #544 is caused by bitwise dead code elimination (BDCE) in WOPT. It
> happens with -m32 -O2 (or -O3 but not -O0). The test case looks as
> follows:
>
> int sy(unsigned long a)
> {
>   unsigned long j4;
>   long tmp;
>   j4=a+(a&0x5555555555555555)>>0x1;
>   return j4&0x44;
> }
>
> Before BDCE, the expression a&0x5555555555555555 is converted into:
>
> LDID U8 U8 sym5v2 50 ty=502 <u=2 cr17> flags:0x0 b=-1
> LDC I8 6148914691236517205 <u=0 cr5> flags:0x0 b=-1
> I8BAND <u=1 cr18> isop_flags:0xc0040 flags:0x1 b=E1
>
> Now, the expression j4&0x44 causes BDCE to mark only the 2nd and 6th
> bits as live for j4. Going backward,
> j4=(a+(a&0x5555555555555555))>>0x1 causes BDCE to think that only the
> first byte of a&0x5555555555555555 is live. Based on this information,
> BDCE thinks we can convert the I8BAND (with both operand type and result
> type being 64-bit long) to I4BAND (with both operand type and result
> type being 32-bit long). However, in this case, the result types of
> both of I8BAND's children cannot be shortened to 32-bit long. So, we
> cannot do this I8BAND -> I4BAND transformation in this case.
>
> -- Feng
>
> _______________________________________________
> Open64-devel mailing list
> Open64-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/open64-devel
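
To illustrate why the narrowing discussed in this thread preserves the result once the operands are narrowed as well, here is a minimal C sketch. It is not the Open64 patch and does not use any compiler internals. The function names sy_wide, sy_narrow and check_same are hypothetical, the explicit uint64_t types are an assumption chosen to mirror the U8/I8 types in the WHIRL dump, and the casts to uint32_t stand in for what U4U8CVT is assumed to do (a conversion from an 8-byte to a 4-byte unsigned value).

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: the test case with explicit 64-bit arithmetic.
   Assumption: uint64_t mirrors the U8/I8 types shown in the dump. */
int sy_wide(uint64_t a)
{
    uint64_t j4 = (a + (a & UINT64_C(0x5555555555555555))) >> 1;
    return (int)(j4 & 0x44);
}

/* Narrowed version: both operands of the AND are truncated to 32 bits
   first (the casts play the role assumed for U4U8CVT), and the AND,
   the add and the shift are then done in 32 bits. */
int sy_narrow(uint64_t a)
{
    uint32_t lo   = (uint32_t)a;                             /* operand 1 */
    uint32_t mask = (uint32_t)UINT64_C(0x5555555555555555);  /* operand 2 */
    uint32_t j4   = (lo + (lo & mask)) >> 1;
    return (int)(j4 & 0x44);
}

/* Hypothetical helper: both versions must agree for a given input,
   because only bits 2 and 6 of j4 (bits 3 and 7 of the sum) are live,
   and those depend only on the low byte of each operand. */
void check_same(uint64_t a)
{
    assert(sy_wide(a) == sy_narrow(a));
}
```

Calling check_same with any input, including values whose upper 32 bits are set, should not trip the assertion: the carry chain of the addition only moves upward, so the discarded high bits cannot influence the two live bits.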