> On Oct 26, 2020, at 3:33 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
> 
> On Mon, Oct 26, 2020 at 9:05 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>> 
>> On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>> 
>>> 
>>> 
>>>> On Oct 26, 2020, at 1:42 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>>> 
>>>> On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>>>> 
>>>>> 
>>>>> The following is the current change in i386.c, could you check whether 
>>>>> the logic is good?
>>>> 
>>>> x87 handling looks good to me.
>>>> 
>>>> One remaining question: If the function uses MMX regs (either
>>>> internally or as an argument register), but exits in x87 mode, does
>>>> your logic clear the x87 stack?
>>> 
>>> Yes, but not completely.
>>> 
>>> First, as follows:
>>> 
>>>  /* Then, decide which mode (MMX mode or x87 mode) the function exits with,
>>>     in order to decide whether we need to clear the MMX registers or the
>>>     stack registers.  */
>>>  bool exit_with_mmx_mode = false;
>>> 
>>>  exit_with_mmx_mode = ((GET_CODE (crtl->return_rtx) == REG)
>>>                        && (MMX_REG_P (crtl->return_rtx)));
>>> 
>>>  /* Then, let's see whether we can zero all st registers together.  */
>>>  if (!exit_with_mmx_mode)
>>>    st_zeroed = zero_all_st_registers (need_zeroed_hardregs);
>>> 
>>> 
>>> We first check whether this routine exits in MMX mode. If not, it exits in
>>> x87 mode (at exit, “EMMS” should already have been called per the ABI), and
>>> the st/mm registers will be cleared as x87 stack registers.
>>> 
>>> However, within the routine “zero_all_st_registers”:
>>> 
>>> static bool
>>> zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>>> {
>>>  unsigned int num_of_st = 0;
>>>  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>>>    if (STACK_REGNO_P (regno)
>>>        && TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
>>>      {
>>>        num_of_st++;
>>>        break;
>>>      }
>>> 
>>>  if (num_of_st == 0)
>>>    return false;
>>> 
>>> 
>>> In the above, I currently only check whether any “Stack” registers need to
>>> be zeroed.
>>> But it looks like we should also check whether any “MMX” registers need to
>>> be zeroed. If any “MMX” register needs to be zeroed, do we still need to
>>> clear the whole x87 stack?
>> 
>> I think so, but I have to check the details.
> 
> Please compile the following testcase with "-m32 -mmmx":
> 
> --cut here--
> #include <stdio.h>
> 
> typedef int __v2si __attribute__ ((vector_size (8)));
> 
> __v2si zzz;
> 
> void
> __attribute__ ((noinline))
> mmx (__v2si a, __v2si b, __v2si c)
> {
>  __v2si res;
> 
>  res = __builtin_ia32_paddd (a, b);
>  zzz = __builtin_ia32_paddd (res, c);
> 
>  __builtin_ia32_emms ();
> }
> 
> 
> int main ()
> {
>  __v2si a = { 123, 345 };
>  __v2si b = { 234, 456 };
>  __v2si c = { 345, 567 };
> 
>  mmx (a, b, c);
> 
>  printf ("%i, %i\n", zzz[0], zzz[1]);
> 
>  return 0;
> }
> --cut here--
> 
> At the end of the mmx() function:
> 
> 0x080491ed in mmx ()
> (gdb) disass
> Dump of assembler code for function mmx:
>  0x080491e0 <+0>:     paddd  %mm1,%mm0
>  0x080491e3 <+3>:     paddd  %mm2,%mm0
>  0x080491e6 <+6>:     movq   %mm0,0x804c020
> => 0x080491ed <+13>:    emms
>  0x080491ef <+15>:    ret
> End of assembler dump.
> (gdb) i r flo
> st0            <invalid float value> (raw 0xffff00000558000002be)
> st1            <invalid float value> (raw 0xffff000001c8000000ea)
> st2            <invalid float value> (raw 0xffff0000023700000159)
> st3            0                   (raw 0x00000000000000000000)
> st4            0                   (raw 0x00000000000000000000)
> st5            0                   (raw 0x00000000000000000000)
> st6            0                   (raw 0x00000000000000000000)
> st7            0                   (raw 0x00000000000000000000)
> fctrl          0x37f               895
> fstat          0x0                 0
> ftag           0x556a              21866
> fiseg          0x0                 0
> fioff          0x0                 0
> foseg          0x0                 0
> fooff          0x0                 0
> fop            0x0                 0
> 
> There are still values in the MMX registers. However, we are in x87
> mode, so the whole stack has to be cleared.

Yes. I just tried it, and my current implementation behaves correctly.
> 
> Now, what to do if the function uses x87 registers and exits in MMX
> mode? I guess we have to clear all MMX registers (modulo return value
> reg).

I need to add this part.
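
For reference, the combined decision discussed in this thread could be sketched roughly as below. This is a standalone model and not the actual i386.c code: the types and names (zero_request, zero_plan, plan_zeroing, the bit masks) are hypothetical, and the MMX-mode branch follows Uros's guess that all MMX registers except the return value register get cleared.

```c
#include <stdbool.h>

/* Hypothetical model: st(0..7) and mm0..mm7 alias the same storage, so a
   request to zero either kind affects the same physical registers.  */
struct zero_request
{
  unsigned st_mask;   /* bit i set: st(i) needs to be zeroed */
  unsigned mm_mask;   /* bit i set: mm<i> needs to be zeroed */
};

enum zero_action
{
  ZERO_NOTHING,           /* no st/mm register was requested */
  ZERO_WHOLE_X87_STACK,   /* exit in x87 mode: clear the whole stack */
  ZERO_MMX_REGS           /* exit in MMX mode: clear MMX registers */
};

struct zero_plan
{
  enum zero_action action;
  unsigned mm_to_clear;   /* only meaningful for ZERO_MMX_REGS */
};

/* return_mm_regno is the MMX register holding the return value,
   or -1 if the function does not return in an MMX register.  */
struct zero_plan
plan_zeroing (struct zero_request req, bool exit_with_mmx_mode,
              int return_mm_regno)
{
  struct zero_plan plan = { ZERO_NOTHING, 0 };

  /* Check both stack and MMX registers, not only stack registers.  */
  if (req.st_mask == 0 && req.mm_mask == 0)
    return plan;

  if (!exit_with_mmx_mode)
    {
      /* x87 mode at exit: the whole stack has to be cleared.  */
      plan.action = ZERO_WHOLE_X87_STACK;
      return plan;
    }

  /* MMX mode at exit: assumed here to clear all MMX registers,
     modulo the return value register.  */
  plan.action = ZERO_MMX_REGS;
  plan.mm_to_clear = 0xffu;
  if (return_mm_regno >= 0)
    plan.mm_to_clear &= ~(1u << return_mm_regno);
  return plan;
}
```

With this model, a function that only touched mm0..mm2 but exits in x87 mode still yields ZERO_WHOLE_X87_STACK, which matches the testcase above.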

Thanks.
Qing
> 
> Uros.
