Re: [PATCH] rs6000: Stackoverflow in optimized code on PPC (PR100799)

2024-03-22 Thread Ajit Agarwal
Hello Jakub:

I have addressed the comments below and sent version 1 of the patch
for review.

Thanks & Regards
Ajit

On 22/03/24 1:15 pm, Jakub Jelinek wrote:
> On Fri, Mar 22, 2024 at 01:00:21PM +0530, Ajit Agarwal wrote:
>> When using FlexiBLAS with OpenBLAS we noticed corruption of
>> the parameters passed to OpenBLAS functions. FlexiBLAS
>> basically provides a BLAS interface where each function
>> is a stub that forwards the arguments to a real BLAS lib,
>> like OpenBLAS.
>>
>> Fixes the corruption of caller frame checking number of
>> arguments is less than equal to GP_ARG_NUM_REG (8)
>> excluding hidden unused DECLS.
> 
> Thanks for working on this.
> 
>> 2024-03-22  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>> PR rtk-optimization/100799
>> * config/rs600/rs600-calls.cc (rs6000_function_arg): Don't
> 
> These 2 lines are 8 space indented rather than tab.
> 
>>  generate parameter save area if number of arguments passed
>>  less than equal to GP_ARG_NUM_REG (8) excluding hidden
>>  paramter.
>>  * function.cc (assign_parms_initialize_all): Check for hidden
>>  parameter in fortran code and set the flag hidden_string_length
>>  and actual paramter passed excluding hidden unused DECLS.
> 
> s/paramter/parameter/
> 
>>  * function.h: Add new field hidden_string_length and
>>  actual_parm_length in function structure.
> 
> Why do you need to change generic code for something that will only be
> used by a single target?
> I mean, why don't you add the extra members in rs6000.h (struct rs6000_args)
> and initialize them in rs6000-call.cc (init_cumulative_args) -
> the function.cc function you've modified is the only one which uses
> INIT_CUMULATIVE_INCOMING_ARGS and in that case init_cumulative_args is
> called with incoming == true, so move the stuff from function.cc there.
> 
>> --- a/gcc/config/rs6000/rs6000-call.cc
>> +++ b/gcc/config/rs6000/rs6000-call.cc
>> @@ -1857,7 +1857,16 @@ rs6000_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
>>  
>>return rs6000_finish_function_arg (mode, rvec, k);
>>  }
>> -  else if (align_words < GP_ARG_NUM_REG)
>> + /* Workaround buggy C/C++ wrappers around Fortran routines with
>> +character(len=constant) arguments if the hidden string length arguments
>> +are passed on the stack; if the callers forget to pass those arguments,
>> +attempting to tail call in such routines leads to stack corruption.
>> +Avoid return stack space for parameters <= 8 excluding hidden string
>> +length argument is passed (partially or fully) on the stack in the
>> +caller and the callee needs to pass any arguments on the stack.  */
>> +  else if (align_words < GP_ARG_NUM_REG
>> +   || (cfun->hidden_string_length
>> +   && cfun->actual_parm_length <= GP_ARG_NUM_REG))
>>  {
>>if (TARGET_32BIT && TARGET_POWERPC64)
>>  return rs6000_mixed_function_arg (mode, type, align_words);
>> diff --git a/gcc/function.cc b/gcc/function.cc
>> index 3cef6c17bce..1318564b466 100644
>> --- a/gcc/function.cc
>> +++ b/gcc/function.cc
>> @@ -2326,6 +2326,32 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>>  #endif
>>    all->args_so_far = pack_cumulative_args (&all->args_so_far_v);
>>  
>> +  unsigned int num_args = 0;
>> +  unsigned int hidden_length = 0;
>> +
>> +  /* Workaround buggy C/C++ wrappers around Fortran routines with
>> + character(len=constant) arguments if the hidden string length arguments
>> + are passed on the stack; if the callers forget to pass those arguments,
>> + attempting to tail call in such routines leads to stack corruption.
>> + Avoid return stack space for parameters <= 8 excluding hidden string
>> + length argument is passed (partially or fully) on the stack in the
>> + caller and the callee needs to pass any arguments on the stack.  */
>> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
>> +   arg; arg = DECL_CHAIN (arg))
>> +{
>> +  num_args++;
>> +  if (DECL_HIDDEN_STRING_LENGTH (arg))
>> +{
>> +  tree parmdef = ssa_default_def (cfun, arg);
>> +  if (parmdef == NULL || has_zero_uses (parmdef))
>> +{
>> +  cfun->hidden_string_length = 1;
>> +  hidden_length++;
>> +}
>> +}
>> +   }
>> +
>> +  cfun->actual_parm_length = num_args - hidden_length;
>>  #ifdef INCOMING_REG_PARM_STACK_SPACE
>>all->reg_parm_stack_space
>>  = INCOMING_REG_PARM_STACK_SPACE (current_function_decl);
>> diff --git a/gcc/function.h b/gcc/function.h
>> index 19e15bd63b0..5984f0007c2 100644
>> --- a/gcc/function.h
>> +++ b/gcc/function.h
>> @@ -346,6 +346,11 @@ struct GTY(()) function {
>>/* Last assigned dependence info clique.  */
>>unsigned short last_clique;
>>  
>> +  /* Actual parameter length ignoring hidden paramter.
>> + This is done to C++ wrapper calling fortran module
>> + which has hidden parameter that are not used. 

Re: [PATCH] rs6000: Stackoverflow in optimized code on PPC (PR100799)

2024-03-22 Thread Jakub Jelinek
On Fri, Mar 22, 2024 at 01:00:21PM +0530, Ajit Agarwal wrote:
> When using FlexiBLAS with OpenBLAS we noticed corruption of
> the parameters passed to OpenBLAS functions. FlexiBLAS
> basically provides a BLAS interface where each function
> is a stub that forwards the arguments to a real BLAS lib,
> like OpenBLAS.
> 
> Fixes the corruption of caller frame checking number of
> arguments is less than equal to GP_ARG_NUM_REG (8)
> excluding hidden unused DECLS.

Thanks for working on this.

> 2024-03-22  Ajit Kumar Agarwal  
> 
> gcc/ChangeLog:
> 
> PR rtk-optimization/100799
> * config/rs600/rs600-calls.cc (rs6000_function_arg): Don't

These 2 lines are 8 space indented rather than tab.

>   generate parameter save area if number of arguments passed
>   less than equal to GP_ARG_NUM_REG (8) excluding hidden
>   paramter.
>   * function.cc (assign_parms_initialize_all): Check for hidden
>   parameter in fortran code and set the flag hidden_string_length
>   and actual paramter passed excluding hidden unused DECLS.

s/paramter/parameter/

>   * function.h: Add new field hidden_string_length and
>   actual_parm_length in function structure.

Why do you need to change generic code for something that will only be
used by a single target?
I mean, why don't you add the extra members in rs6000.h (struct rs6000_args)
and initialize them in rs6000-call.cc (init_cumulative_args) -
the function.cc function you've modified is the only one which uses
INIT_CUMULATIVE_INCOMING_ARGS and in that case init_cumulative_args is
called with incoming == true, so move the stuff from function.cc there.
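
A minimal sketch of that suggestion, using stand-in types rather than real
GCC internals. ParamInfo, TargetArgs, init_incoming_args and pass_in_gpr
are hypothetical names invented for illustration; in GCC the fields would
live in the target's argument state and the loop would walk
DECL_ARGUMENTS as in the posted hunk. It only models the counting logic
and the extra condition from the patch, not the actual rs6000 code.

```cpp
#include <vector>

// Hypothetical stand-in for a PARM_DECL; real GCC uses `tree` nodes.
struct ParamInfo {
  bool hidden_string_length;  // models DECL_HIDDEN_STRING_LENGTH (arg)
  bool used;                  // models "default def exists and has uses"
};

// Hypothetical subset of target-specific argument state. The two fields
// mirror the ones the patch added to struct function.
struct TargetArgs {
  bool hidden_string_length = false;
  unsigned actual_parm_length = 0;
};

constexpr unsigned kGpArgNumReg = 8;  // GP_ARG_NUM_REG (8) in the patch

// Models the work moved into the target hook for the incoming == true case:
// count all parameters, then subtract unused hidden string lengths.
void init_incoming_args(TargetArgs &cum, const std::vector<ParamInfo> &params) {
  unsigned num_args = 0;
  unsigned unused_hidden = 0;
  for (const ParamInfo &p : params) {
    ++num_args;
    if (p.hidden_string_length && !p.used) {
      cum.hidden_string_length = true;
      ++unused_hidden;
    }
  }
  cum.actual_parm_length = num_args - unused_hidden;
}

// Mirrors the extra condition in the posted rs6000_function_arg hunk.
bool pass_in_gpr(const TargetArgs &cum, unsigned align_words) {
  return align_words < kGpArgNumReg
         || (cum.hidden_string_length
             && cum.actual_parm_length <= kGpArgNumReg);
}
```

With eight ordinary parameters plus one unused hidden string length, this
model reports actual_parm_length == 8 and takes the register path even
when align_words has reached 8. Keeping both fields in target state means
generic code such as function.h needs no new members.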

> --- a/gcc/config/rs6000/rs6000-call.cc
> +++ b/gcc/config/rs6000/rs6000-call.cc
> @@ -1857,7 +1857,16 @@ rs6000_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
>  
> return rs6000_finish_function_arg (mode, rvec, k);
>   }
> -  else if (align_words < GP_ARG_NUM_REG)
> + /* Workaround buggy C/C++ wrappers around Fortran routines with
> + character(len=constant) arguments if the hidden string length arguments
> + are passed on the stack; if the callers forget to pass those arguments,
> + attempting to tail call in such routines leads to stack corruption.
> + Avoid return stack space for parameters <= 8 excluding hidden string
> + length argument is passed (partially or fully) on the stack in the
> + caller and the callee needs to pass any arguments on the stack.  */
> +  else if (align_words < GP_ARG_NUM_REG
> +|| (cfun->hidden_string_length
> +&& cfun->actual_parm_length <= GP_ARG_NUM_REG))
>   {
> if (TARGET_32BIT && TARGET_POWERPC64)
>   return rs6000_mixed_function_arg (mode, type, align_words);
> diff --git a/gcc/function.cc b/gcc/function.cc
> index 3cef6c17bce..1318564b466 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -2326,6 +2326,32 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
>  #endif
>    all->args_so_far = pack_cumulative_args (&all->args_so_far_v);
>  
> +  unsigned int num_args = 0;
> +  unsigned int hidden_length = 0;
> +
> +  /* Workaround buggy C/C++ wrappers around Fortran routines with
> + character(len=constant) arguments if the hidden string length arguments
> + are passed on the stack; if the callers forget to pass those arguments,
> + attempting to tail call in such routines leads to stack corruption.
> + Avoid return stack space for parameters <= 8 excluding hidden string
> + length argument is passed (partially or fully) on the stack in the
> + caller and the callee needs to pass any arguments on the stack.  */
> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
> +   arg; arg = DECL_CHAIN (arg))
> +{
> +  num_args++;
> +  if (DECL_HIDDEN_STRING_LENGTH (arg))
> + {
> +   tree parmdef = ssa_default_def (cfun, arg);
> +   if (parmdef == NULL || has_zero_uses (parmdef))
> + {
> +   cfun->hidden_string_length = 1;
> +   hidden_length++;
> + }
> + }
> +   }
> +
> +  cfun->actual_parm_length = num_args - hidden_length;
>  #ifdef INCOMING_REG_PARM_STACK_SPACE
>all->reg_parm_stack_space
>  = INCOMING_REG_PARM_STACK_SPACE (current_function_decl);
> diff --git a/gcc/function.h b/gcc/function.h
> index 19e15bd63b0..5984f0007c2 100644
> --- a/gcc/function.h
> +++ b/gcc/function.h
> @@ -346,6 +346,11 @@ struct GTY(()) function {
>/* Last assigned dependence info clique.  */
>unsigned short last_clique;
>  
> +  /* Actual parameter length ignoring hidden paramter.
> + This is done to C++ wrapper calling fortran module
> + which has hidden parameter that are not used.  */
> +  unsigned int actual_parm_length;
> +
>/* Collected bit flags.  */
>  
>/* Number of units of general registers that need saving in stdarg
> @@ -442,6 +447,11 @@ struct GTY(()) function {
>/* Set for artificial function created for