Hi, Thanks for doing this.
Martin Jambor <mjam...@suse.cz> writes: > +/* Analyze function body scan results stored in param_accesses and > + param_accesses, detect possible transformations and store information of > + those in function summary. NODE, FUN and IFS are all various structures > + describing the currently analyzed function. */ > + > +static void > +process_scan_results (cgraph_node *node, struct function *fun, > + isra_func_summary *ifs, > + vec<gensum_param_desc> *param_descriptions) > +{ > + bool check_pass_throughs = false; > + bool dereferences_propagated = false; > + tree parm = DECL_ARGUMENTS (node->decl); > + unsigned param_count = param_descriptions->length(); > + > + for (unsigned desc_index = 0; > + desc_index < param_count; > + desc_index++, parm = DECL_CHAIN (parm)) > + { > + gensum_param_desc *desc = &(*param_descriptions)[desc_index]; > + if (!desc->locally_unused && !desc->split_candidate) > + continue; I'm jumping in the middle without working through the whole pass, so this is probably a daft question sorry, but: what is this loop required to do when: !desc->split_candidate && desc->locally_unused ? AFAICT... > + > + if (flag_checking) > + isra_verify_access_tree (desc->accesses); > + > + if (!dereferences_propagated > + && desc->by_ref > + && desc->accesses) > + { > + propagate_dereference_distances (fun); > + dereferences_propagated = true; > + } > + > + HOST_WIDE_INT nonarg_acc_size = 0; > + bool only_calls = true; > + bool check_failed = false; > + > + int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index; > + for (gensum_param_access *acc = desc->accesses; > + acc; > + acc = acc->next_sibling) > + if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls, > + entry_bb_index)) > + { > + check_failed = true; > + break; > + } > + if (check_failed) > + continue; > + > + if (only_calls) > + desc->locally_unused = true; > + > + HOST_WIDE_INT cur_param_size > + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm))); > + HOST_WIDE_INT param_size_limit; > + if (!desc->by_ref || optimize_function_for_size_p (fun)) > + param_size_limit = cur_param_size; > + else > + param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR) > + * cur_param_size); > + if (nonarg_acc_size > param_size_limit > + || (!desc->by_ref && nonarg_acc_size == param_size_limit)) > + { > + disqualify_split_candidate (desc, "Would result into a too big set of" > + "replacements."); > + } > + else > + { > + /* create_parameter_descriptors makes sure unit sizes of all > + candidate parameters fit unsigned integers restricted to > + ISRA_ARG_SIZE_LIMIT. */ > + desc->param_size_limit = param_size_limit / BITS_PER_UNIT; > + desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT; > + if (desc->split_candidate && desc->ptr_pt_count) > + { > + gcc_assert (desc->by_ref); > + check_pass_throughs = true; > + } > + } > + } ...disqualify_split_candidate should be a no-op in that case, because we've already disqualified the parameter for a different reason. So it looks like the main effect is instead to set up param_size_limit and nonarg_acc_size, the latter of which I assume is 0 when desc->locally_unused. The reason for asking is that the final "else" says that we've already checked that param_size_limit is in range, but that's only true if desc->split_candidate. In particular: if (is_gimple_reg (parm) && !isra_track_scalar_param_local_uses (fun, node, parm, num, &scalar_call_uses)) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " is a scalar with only %i call uses\n", scalar_call_uses); desc->locally_unused = true; desc->call_uses = scalar_call_uses; } happens before: if (type_size == 0 || type_size >= ISRA_ARG_SIZE_LIMIT) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " not a candidate, has zero or huge size\n"); continue; } An uninteresting case in which we truncate the value is: typedef unsigned int big_vec __attribute__((vector_size (0x20000))); void foo (big_vec x) {} and going further triggers an assert in the tree_to_uhwi (...): typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61))); void foo (omg_vec x) {} but as you can guess, SVE is the real reason I'm asking. ;-) FWIW, I tried changing it to: gensum_param_desc *desc = &(*param_descriptions)[desc_index]; if (!desc->split_candidate) continue; and didn't get any test regressions (on aarch64-linux-gnu). Thanks, Richard