Thank you very much for the comment. I have updated the fix as attached. Let me know if you have any problems.
-- Feng On Thu, Mar 3, 2011 at 11:20 PM, Fred Chow <frdc...@gmail.com> wrote: > Hi Feng, > > The code is OK. But you can further remove need_cvt by merging the two > if's as: > > if (new_cr->Dtyp() != opnd->Dtyp()) { > opc = OPCODE_make_op(OPR_CVT, new_cr->Dtyp(), opnd->Dtyp()); > cvt_cr = Htable()->Add_unary_node(opc, opnd); > new_cr->Set_opnd(index, cvt_cr); > need_rehash = TRUE; > } > > Fred > > On 03/03/2011 05:06 PM, Feng Zhou wrote: >> >> Thank you very much for the suggestion. The updated patch is attached. >> >> I also changed the update code. Instead of updating cr, I now update >> new_cr, because looks like to me that anytime we need to modify cr, we >> copy it to new_cr and then update new_cr and rehash it. >> >> -- Feng >> >> >> >> On Tue, Mar 1, 2011 at 11:37 PM, Fred Chow<frdc...@gmail.com> wrote: >>> >>> Hi Feng, >>> >>> You need to also set need_rehash to TRUE because, with the insertion of >>> CVT >>> to some of its kids, the node being worked on now looks like a different >>> expression tree. >>> >>> Also, the two cases: >>> >>> + if (cr->Dsctyp() != MTYPE_V&& >>> + cr->Dsctyp() != opnd->Dtyp()) { >>> + opc = OPCODE_make_op(OPR_CVT, cr->Dsctyp(), opnd->Dtyp()); >>> + need_cvt = TRUE; >>> + } else if (cr->Dtyp() != opnd->Dtyp()) { >>> + opc = OPCODE_make_op(OPR_CVT, cr->Dtyp(), opnd->Dtyp()); >>> + need_cvt = TRUE; >>> + } >>> >>> can be just >>> >>> + if (cr->Dtyp() != opnd->Dtyp()) { >>> + opc = OPCODE_make_op(OPR_CVT, cr->Dtyp(), opnd->Dtyp()); >>> + need_cvt = TRUE; >>> + } >>> >>> because for those operators, Dsctyp() and Dtyp() are guaranteed the same. >>> This is not a bug, but just to make the code simpler. Thanks. >>> >>> Fred >>> >>> On 02/28/2011 09:09 AM, Feng Zhou wrote: >>>> >>>> Hello, all >>>> >>>> I have attached an updated fix to bug #544. This fix tried to address >>>> what Fred pointed out. Can some gatekeeper review it for me please? >>>> Thank you very much. >>>> >>>> -- Feng >>>> >>>> >>>> >>>> On Wed, Feb 16, 2011 at 4:45 PM, Feng Zhou<fengzho...@gmail.com> >>>> wrote: >>>>> >>>>> Thanks for the suggestion. I will update the fix and send it for code >>>>> review again when it is ready. >>>>> >>>>> --Feng >>>>> >>>>> >>>>> >>>>> On Tue, Feb 15, 2011 at 10:58 PM, Fred Chow<frdc...@gmail.com> >>>>> wrote: >>>>>> >>>>>> Hi Feng, >>>>>> >>>>>> Your fix prevents the good optimization to avoid the assertion. This >>>>>> is >>>>>> a >>>>>> good optimization because changing the 64-bit operation to 32-bit >>>>>> helps >>>>>> performance under -m32, because 64-bit operations are simulated by >>>>>> pairs >>>>>> of >>>>>> 32-bit operations. >>>>>> >>>>>> I think the better fix (without preventing the good optimization) is >>>>>> to >>>>>> also >>>>>> fix the size of the operands by inserting U4U8CVT for both operands. >>>>>> >>>>>> Fred Chow >>>>>> >>>>>> On 02/15/2011 09:31 AM, Feng Zhou wrote: >>>>>> >>>>>> Hello, all >>>>>> >>>>>> Can gatekeeper review the patch to bug #544 for me please? Thank you. >>>>>> >>>>>> Bug #544 is caused by bitwise dead code elimniation (BDCE) in WOPT. It >>>>>> happens with -m32 -O2 (or -O3 but not -O0). The test case looks like >>>>>> follows: >>>>>> >>>>>> int sy(unsigned long a) >>>>>> { >>>>>> unsigned long j4; >>>>>> long tmp; >>>>>> j4=a+(a&0x5555555555555555)>>0x1; >>>>>> return j4&0x44; >>>>>> } >>>>>> >>>>>> Before BDCE, the expression a&0x5555555555555555 is converted into: >>>>>> >>>>>> LDID U8 U8 sym5v2 50 ty=502<u=2 cr17> flags:0x0 b=-1 >>>>>> LDC I8 6148914691236517205<u=0 cr5> flags:0x0 b=-1 >>>>>> I8BAND<u=1 cr18> isop_flags:0xc0040 flags:0x1 b=E1 >>>>>> >>>>>> Now, the expression j4&0x44 causes BDCE to mark only the 2nd and 6th >>>>>> bit being live for j4. Going backward, >>>>>> j4=(a+a&0x5555555555555555)>>0x1 causes BDCE think only the first byte >>>>>> of a&0x5555555555555555 is live. Based on this information, BDCE think >>>>>> we can use converted the I8BAND (with both operand type and result >>>>>> type being 64-bit long) to I4BAND (with both operand type and result >>>>>> type being 32-bit long). However, in this case, both of the result >>>>>> type of I8BAND's children can not be shortened to 32-bit long. So, we >>>>>> can not do this I8BAND -> I4BAND transformation in this case. >>>>>> >>>>>> -- Feng >>>>>> >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio >>>>>> XE: >>>>>> Pinpoint memory and threading errors before they happen. >>>>>> Find and fix more than 250 security defects in the development cycle. >>>>>> Locate bottlenecks in serial and parallel code that limit performance. >>>>>> http://p.sf.net/sfu/intel-dev2devfeb >>>>>> >>>>>> _______________________________________________ >>>>>> Open64-devel mailing list >>>>>> Open64-devel@lists.sourceforge.net >>>>>> https://lists.sourceforge.net/lists/listinfo/open64-devel >>>>>> >>>>>> >>> > >
bug544_v8.patch
Description: Binary data
------------------------------------------------------------------------------ What You Don't Know About Data Connectivity CAN Hurt You This paper provides an overview of data connectivity, details its effect on application quality, and explores various alternative solutions. http://p.sf.net/sfu/progress-d2d
_______________________________________________ Open64-devel mailing list Open64-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/open64-devel