Hi Feng,

     The code is OK.  But you can further remove need_cvt by merging the 
two if's as:

     if (new_cr->Dtyp() != opnd->Dtyp()) {
              opc = OPCODE_make_op(OPR_CVT, new_cr->Dtyp(), opnd->Dtyp());
              cvt_cr = Htable()->Add_unary_node(opc, opnd);
              new_cr->Set_opnd(index, cvt_cr);
              need_rehash = TRUE;
      }

Fred

On 03/03/2011 05:06 PM, Feng Zhou wrote:
> Thank you very much for the suggestion. The updated patch is attached.
>
> I also changed the update code. Instead of updating cr, I now update
> new_cr, because looks like to me that anytime we need to modify cr, we
> copy it to new_cr and then update new_cr and rehash it.
>
> -- Feng
>
>
>
> On Tue, Mar 1, 2011 at 11:37 PM, Fred Chow<frdc...@gmail.com>  wrote:
>> Hi Feng,
>>
>> You need to also set need_rehash to TRUE because, with the insertion of CVT
>> to some of its kids, the node being worked on now looks like a different
>> expression tree.
>>
>> Also, the two cases:
>>
>> +          if (cr->Dsctyp() != MTYPE_V&&
>> +                  cr->Dsctyp() != opnd->Dtyp()) {
>> +              opc = OPCODE_make_op(OPR_CVT, cr->Dsctyp(), opnd->Dtyp());
>> +              need_cvt = TRUE;
>> +          } else if (cr->Dtyp() != opnd->Dtyp()) {
>> +              opc = OPCODE_make_op(OPR_CVT, cr->Dtyp(), opnd->Dtyp());
>> +              need_cvt = TRUE;
>> +          }
>>
>> can be just
>>
>> +          if (cr->Dtyp() != opnd->Dtyp()) {
>> +              opc = OPCODE_make_op(OPR_CVT, cr->Dtyp(), opnd->Dtyp());
>> +              need_cvt = TRUE;
>> +          }
>>
>> because for those operators, Dsctyp() and Dtyp() are guaranteed the same.
>> This is not a bug, but just to make the code simpler.  Thanks.
>>
>> Fred
>>
>> On 02/28/2011 09:09 AM, Feng Zhou wrote:
>>> Hello, all
>>>
>>> I have attached an updated fix to bug #544. This fix tried to address
>>> what Fred pointed out. Can some gatekeeper review it for me please?
>>> Thank you very much.
>>>
>>> -- Feng
>>>
>>>
>>>
>>> On Wed, Feb 16, 2011 at 4:45 PM, Feng Zhou<fengzho...@gmail.com>    wrote:
>>>> Thanks for the suggestion. I will update the fix and send it for code
>>>> review again when it is ready.
>>>>
>>>> --Feng
>>>>
>>>>
>>>>
>>>> On Tue, Feb 15, 2011 at 10:58 PM, Fred Chow<frdc...@gmail.com>    wrote:
>>>>> Hi Feng,
>>>>>
>>>>> Your fix prevents the good optimization to avoid the assertion.  This is
>>>>> a
>>>>> good optimization because changing the 64-bit operation to 32-bit helps
>>>>> performance under -m32, because 64-bit operations are simulated by pairs
>>>>> of
>>>>> 32-bit operations.
>>>>>
>>>>> I think the better fix (without preventing the good optimization) is to
>>>>> also
>>>>> fix the size of the operands by inserting U4U8CVT for both operands.
>>>>>
>>>>> Fred Chow
>>>>>
>>>>> On 02/15/2011 09:31 AM, Feng Zhou wrote:
>>>>>
>>>>> Hello, all
>>>>>
>>>>> Can gatekeeper review the patch to bug #544 for me please? Thank you.
>>>>>
>>>>> Bug #544 is caused by bitwise dead code elimniation (BDCE) in WOPT. It
>>>>> happens with -m32 -O2 (or -O3 but not -O0). The test case looks like
>>>>> follows:
>>>>>
>>>>> int sy(unsigned long a)
>>>>> {
>>>>>     unsigned long j4;
>>>>>     long tmp;
>>>>>     j4=a+(a&0x5555555555555555)>>0x1;
>>>>>     return j4&0x44;
>>>>> }
>>>>>
>>>>> Before BDCE, the expression a&0x5555555555555555 is converted into:
>>>>>
>>>>>       LDID U8 U8 sym5v2 50 ty=502<u=2 cr17>    flags:0x0 b=-1
>>>>>       LDC I8 6148914691236517205<u=0 cr5>    flags:0x0 b=-1
>>>>>      I8BAND<u=1 cr18>    isop_flags:0xc0040 flags:0x1 b=E1
>>>>>
>>>>> Now, the expression j4&0x44 causes BDCE to mark only the 2nd and 6th
>>>>> bit being live for j4. Going backward,
>>>>> j4=(a+a&0x5555555555555555)>>0x1 causes BDCE think only the first byte
>>>>> of a&0x5555555555555555 is live. Based on this information, BDCE think
>>>>> we can use converted the I8BAND (with both operand type and result
>>>>> type being 64-bit long) to I4BAND (with both operand type and result
>>>>> type being 32-bit long). However, in this case, both of the result
>>>>> type of I8BAND's children can not be shortened to 32-bit long. So, we
>>>>> can not do this I8BAND ->    I4BAND transformation in this case.
>>>>>
>>>>> -- Feng
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio
>>>>> XE:
>>>>> Pinpoint memory and threading errors before they happen.
>>>>> Find and fix more than 250 security defects in the development cycle.
>>>>> Locate bottlenecks in serial and parallel code that limit performance.
>>>>> http://p.sf.net/sfu/intel-dev2devfeb
>>>>>
>>>>> _______________________________________________
>>>>> Open64-devel mailing list
>>>>> Open64-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/open64-devel
>>>>>
>>>>>
>>


------------------------------------------------------------------------------
What You Don't Know About Data Connectivity CAN Hurt You
This paper provides an overview of data connectivity, details
its effect on application quality, and explores various alternative
solutions. http://p.sf.net/sfu/progress-d2d
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel

Reply via email to