Hi,

sorry for replying so late, I still haven't recovered from two weeks of
traveling and conferences.

On Sat, Sep 21 2019, Richard Sandiford wrote:
> Hi,
>
> Thanks for doing this.
>
> Martin Jambor <mjam...@suse.cz> writes:
>> +/* Analyze function body scan results stored in param_accesses and
>> +   param_accesses, detect possible transformations and store information of
>> +   those in function summary.  NODE, FUN and IFS are all various structures
>> +   describing the currently analyzed function.  */
>> +
>> +static void
>> +process_scan_results (cgraph_node *node, struct function *fun,
>> +                  isra_func_summary *ifs,
>> +                  vec<gensum_param_desc> *param_descriptions)
>> +{
>> +  bool check_pass_throughs = false;
>> +  bool dereferences_propagated = false;
>> +  tree parm = DECL_ARGUMENTS (node->decl);
>> +  unsigned param_count = param_descriptions->length();
>> +
>> +  for (unsigned desc_index = 0;
>> +       desc_index < param_count;
>> +       desc_index++, parm = DECL_CHAIN (parm))
>> +    {
>> +      gensum_param_desc *desc = &(*param_descriptions)[desc_index];
>> +      if (!desc->locally_unused && !desc->split_candidate)
>> +    continue;
>
> I'm jumping in the middle without working through the whole pass,
> so this is probably a daft question sorry, but: what is this loop
> required to do when:
>
>   !desc->split_candidate && desc->locally_unused

You have figured out correctly that this is a thinko.  I meant not to
continue for non-register-types which might not be used locally but
their locally_unused flag is only set a few lines below...

>
> ?  AFAICT...
>
>> +
>> +      if (flag_checking)
>> +    isra_verify_access_tree (desc->accesses);
>> +
>> +      if (!dereferences_propagated
>> +      && desc->by_ref
>> +      && desc->accesses)
>> +    {
>> +      propagate_dereference_distances (fun);
>> +      dereferences_propagated = true;
>> +    }
>> +
>> +      HOST_WIDE_INT nonarg_acc_size = 0;
>> +      bool only_calls = true;
>> +      bool check_failed = false;
>> +
>> +      int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index;
>> +      for (gensum_param_access *acc = desc->accesses;
>> +       acc;
>> +       acc = acc->next_sibling)
>> +    if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls,
>> +                             entry_bb_index))
>> +      {
>> +        check_failed = true;
>> +        break;
>> +      }
>> +      if (check_failed)
>> +    continue;
>> +
>> +      if (only_calls)
>> +    desc->locally_unused = true;

...specifically here.

>> +
>> +      HOST_WIDE_INT cur_param_size
>> +    = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm)));
>> +      HOST_WIDE_INT param_size_limit;
>> +      if (!desc->by_ref || optimize_function_for_size_p (fun))
>> +    param_size_limit = cur_param_size;
>> +      else
>> +    param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR)
>> +                       * cur_param_size);
>> +      if (nonarg_acc_size > param_size_limit
>> +      || (!desc->by_ref && nonarg_acc_size == param_size_limit))
>> +    {
>> +      disqualify_split_candidate (desc, "Would result into a too big set of"
>> +                                  "replacements.");
>> +    }
>> +      else
>> +    {
>> +      /* create_parameter_descriptors makes sure unit sizes of all
>> +         candidate parameters fit unsigned integers restricted to
>> +         ISRA_ARG_SIZE_LIMIT.  */
>> +      desc->param_size_limit = param_size_limit / BITS_PER_UNIT;
>> +      desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT;
>> +      if (desc->split_candidate && desc->ptr_pt_count)
>> +        {
>> +          gcc_assert (desc->by_ref);
>> +          check_pass_throughs = true;
>> +        }
>> +    }
>> +    }
>
> ...disqualify_split_candidate should be a no-op in that case,
> because we've already disqualified the parameter for a different reason.
> So it looks like the main effect is instead to set up param_size_limit
> and nonarg_acc_size, the latter of which I assume is 0 when
> desc->locally_unused.

This is the only bit where you are wrong, param_size_limit is the type
size for aggregates and twice pointer size for pointers (well, actually
PARAM_IPA_SRA_PTR_GROWTH_FACTOR times the size of a pointer).  Even for
locally unused parameters because we might "pull" some of them from
callees, when there are some.  But it is not really relevant for the
problem you are facing.

>
> The reason for asking is that the final "else" says that we've already
> checked that param_size_limit is in range, but that's only true if
> desc->split_candidate.  In particular:
>
>       if (is_gimple_reg (parm)
>         && !isra_track_scalar_param_local_uses (fun, node, parm, num,
>                                                 &scalar_call_uses))
>       {
>         if (dump_file && (dump_flags & TDF_DETAILS))
>           fprintf (dump_file, " is a scalar with only %i call uses\n",
>                    scalar_call_uses);
>
>         desc->locally_unused = true;
>         desc->call_uses = scalar_call_uses;
>       }
>
> happens before:
>
>       if (type_size == 0
>         || type_size >= ISRA_ARG_SIZE_LIMIT)
>       {
>         if (dump_file && (dump_flags & TDF_DETAILS))
>           fprintf (dump_file, " not a candidate, has zero or huge size\n");
>         continue;
>       }
>
> An uninteresting case in which we truncate the value is:
>
>     typedef unsigned int big_vec __attribute__((vector_size (0x20000)));
>     void foo (big_vec x) {}
>
> and going further triggers an assert in the tree_to_uhwi (...):
>
>     typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61)));
>     void foo (omg_vec x) {}
>
> but as you can guess, SVE is the real reason I'm asking. ;-)
>
> FWIW, I tried changing it to:
>
>       gensum_param_desc *desc = &(*param_descriptions)[desc_index];
>       if (!desc->split_candidate)
>       continue;
>
> and didn't get any test regressions (on aarch64-linux-gnu).


It is the correct thing to do, sorry for the breakage.  I have to run
now but will prepare a patch tomorrow.

Thanks,

Martin

Reply via email to