Thank you very much for the comment. I have updated the fix as
attached. Let me know if you have any problems.

-- Feng



On Thu, Mar 3, 2011 at 11:20 PM, Fred Chow <frdc...@gmail.com> wrote:
> Hi Feng,
>
>    The code is OK.  But you can further remove need_cvt by merging the two
> if's as:
>
>    if (new_cr->Dtyp() != opnd->Dtyp()) {
>             opc = OPCODE_make_op(OPR_CVT, new_cr->Dtyp(), opnd->Dtyp());
>             cvt_cr = Htable()->Add_unary_node(opc, opnd);
>             new_cr->Set_opnd(index, cvt_cr);
>             need_rehash = TRUE;
>     }
>
> Fred
>
> On 03/03/2011 05:06 PM, Feng Zhou wrote:
>>
>> Thank you very much for the suggestion. The updated patch is attached.
>>
>> I also changed the update code. Instead of updating cr, I now update
>> new_cr, because looks like to me that anytime we need to modify cr, we
>> copy it to new_cr and then update new_cr and rehash it.
>>
>> -- Feng
>>
>>
>>
>> On Tue, Mar 1, 2011 at 11:37 PM, Fred Chow<frdc...@gmail.com>  wrote:
>>>
>>> Hi Feng,
>>>
>>> You need to also set need_rehash to TRUE because, with the insertion of
>>> CVT
>>> to some of its kids, the node being worked on now looks like a different
>>> expression tree.
>>>
>>> Also, the two cases:
>>>
>>> +          if (cr->Dsctyp() != MTYPE_V&&
>>> +                  cr->Dsctyp() != opnd->Dtyp()) {
>>> +              opc = OPCODE_make_op(OPR_CVT, cr->Dsctyp(), opnd->Dtyp());
>>> +              need_cvt = TRUE;
>>> +          } else if (cr->Dtyp() != opnd->Dtyp()) {
>>> +              opc = OPCODE_make_op(OPR_CVT, cr->Dtyp(), opnd->Dtyp());
>>> +              need_cvt = TRUE;
>>> +          }
>>>
>>> can be just
>>>
>>> +          if (cr->Dtyp() != opnd->Dtyp()) {
>>> +              opc = OPCODE_make_op(OPR_CVT, cr->Dtyp(), opnd->Dtyp());
>>> +              need_cvt = TRUE;
>>> +          }
>>>
>>> because for those operators, Dsctyp() and Dtyp() are guaranteed the same.
>>> This is not a bug, but just to make the code simpler.  Thanks.
>>>
>>> Fred
>>>
>>> On 02/28/2011 09:09 AM, Feng Zhou wrote:
>>>>
>>>> Hello, all
>>>>
>>>> I have attached an updated fix to bug #544. This fix tried to address
>>>> what Fred pointed out. Can some gatekeeper review it for me please?
>>>> Thank you very much.
>>>>
>>>> -- Feng
>>>>
>>>>
>>>>
>>>> On Wed, Feb 16, 2011 at 4:45 PM, Feng Zhou<fengzho...@gmail.com>
>>>>  wrote:
>>>>>
>>>>> Thanks for the suggestion. I will update the fix and send it for code
>>>>> review again when it is ready.
>>>>>
>>>>> --Feng
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Feb 15, 2011 at 10:58 PM, Fred Chow<frdc...@gmail.com>
>>>>>  wrote:
>>>>>>
>>>>>> Hi Feng,
>>>>>>
>>>>>> Your fix prevents the good optimization to avoid the assertion.  This
>>>>>> is
>>>>>> a
>>>>>> good optimization because changing the 64-bit operation to 32-bit
>>>>>> helps
>>>>>> performance under -m32, because 64-bit operations are simulated by
>>>>>> pairs
>>>>>> of
>>>>>> 32-bit operations.
>>>>>>
>>>>>> I think the better fix (without preventing the good optimization) is
>>>>>> to
>>>>>> also
>>>>>> fix the size of the operands by inserting U4U8CVT for both operands.
>>>>>>
>>>>>> Fred Chow
>>>>>>
>>>>>> On 02/15/2011 09:31 AM, Feng Zhou wrote:
>>>>>>
>>>>>> Hello, all
>>>>>>
>>>>>> Can gatekeeper review the patch to bug #544 for me please? Thank you.
>>>>>>
>>>>>> Bug #544 is caused by bitwise dead code elimniation (BDCE) in WOPT. It
>>>>>> happens with -m32 -O2 (or -O3 but not -O0). The test case looks like
>>>>>> follows:
>>>>>>
>>>>>> int sy(unsigned long a)
>>>>>> {
>>>>>>    unsigned long j4;
>>>>>>    long tmp;
>>>>>>    j4=a+(a&0x5555555555555555)>>0x1;
>>>>>>    return j4&0x44;
>>>>>> }
>>>>>>
>>>>>> Before BDCE, the expression a&0x5555555555555555 is converted into:
>>>>>>
>>>>>>      LDID U8 U8 sym5v2 50 ty=502<u=2 cr17>    flags:0x0 b=-1
>>>>>>      LDC I8 6148914691236517205<u=0 cr5>    flags:0x0 b=-1
>>>>>>     I8BAND<u=1 cr18>    isop_flags:0xc0040 flags:0x1 b=E1
>>>>>>
>>>>>> Now, the expression j4&0x44 causes BDCE to mark only the 2nd and 6th
>>>>>> bit being live for j4. Going backward,
>>>>>> j4=(a+a&0x5555555555555555)>>0x1 causes BDCE think only the first byte
>>>>>> of a&0x5555555555555555 is live. Based on this information, BDCE think
>>>>>> we can use converted the I8BAND (with both operand type and result
>>>>>> type being 64-bit long) to I4BAND (with both operand type and result
>>>>>> type being 32-bit long). However, in this case, both of the result
>>>>>> type of I8BAND's children can not be shortened to 32-bit long. So, we
>>>>>> can not do this I8BAND ->    I4BAND transformation in this case.
>>>>>>
>>>>>> -- Feng
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio
>>>>>> XE:
>>>>>> Pinpoint memory and threading errors before they happen.
>>>>>> Find and fix more than 250 security defects in the development cycle.
>>>>>> Locate bottlenecks in serial and parallel code that limit performance.
>>>>>> http://p.sf.net/sfu/intel-dev2devfeb
>>>>>>
>>>>>> _______________________________________________
>>>>>> Open64-devel mailing list
>>>>>> Open64-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/open64-devel
>>>>>>
>>>>>>
>>>
>
>

Attachment: bug544_v8.patch
Description: Binary data

------------------------------------------------------------------------------
What You Don't Know About Data Connectivity CAN Hurt You
This paper provides an overview of data connectivity, details
its effect on application quality, and explores various alternative
solutions. http://p.sf.net/sfu/progress-d2d
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel

Reply via email to