Richard Biener <rguent...@suse.de> writes:
> The following refactors the vectorizer vector_costs target API
> to add a new vector_costs::add_vector_cost entry which groups
> all individual sub-stmts we create per "vector stmt", aka SLP
> node.  This allows for the targets to more easily match on
> complex cases like emulated gather/scatter or even just vector
> construction.
>
> The patch itself is just a prototype and leaves out BB vectorization
> for simplicity.  It also does not fully group all vector stmts
> but leaves some bare add_stmt_cost hook invocations.  I'd expect
> the add_stmt_cost hook to still be used for scalar stmt costing and
> for costing added branching around prologue/epilogue.  The
> default implementation of add_vector_cost just dispatches to
> add_stmt_cost for individual stmts.  Eventually the actual data
> we track for the combined costing will diverge (no need to track
> SLP node or stmt_info there?), so targets would eventually be
> expected to implement both hooks and splice out common workers
> to deal with "missing" information coming in from the different
> entries.
>
> This should eventually baby-step us towards the generic vectorizer
> code being able to compute and compare latency and resource
> utilization throughout the scalar / vector loop iteration based
> on latency and throughput data determined on a stmt-by-stmt basis
> from the target.  As given the grouping should be an incremental
> improvement, but I have not tried to see how it can simplify
> the x86 hook implementation - I've been triggered by the aarch64
> reported bootstrap fail on the cleanup RFC I posted given that
> code wants to identify a scalar load that's costed as part of
> a gather/scatter operation.
>
> Any comments or problems you foresee?

Could the stmt_vector_for_cost pointer instead be passed to
TARGET_VECTORIZE_CREATE_COSTS?  The danger with passing it to
add_vector_cost is that the same vector_costs instance might get used
for multiple different costing attempts, in which case only the
stmt_vector_for_cost passed in would be specific to the current attempt.
For complex cases, though, the target's vector_costs should be able
to cache its own target-specific information, with the same
lifetime/scope as the stmt_vector_for_cost.
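
A rough sketch of the distinction, using simplified stand-in types
rather than the real GCC declarations.  Everything here (cost_kind,
cost_entry, cost_vec, costs_per_call, costs_per_attempt, scan,
create_costs) is hypothetical and only illustrates one reading of the
lifetime concern: state cached in the costs object should not outlive
the vector it was derived from.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration; not the real GCC declarations.
enum cost_kind { SCALAR_LOAD, VEC_CONSTRUCT };
struct cost_entry { int count; cost_kind kind; };
using cost_vec = std::vector<cost_entry>;

// Variant A: the entry vector arrives with each add_vector_cost call.
// If one instance is reused for two costing attempts, state cached in
// the object mixes information from both attempts.
struct costs_per_call
{
  int scalar_loads_seen = 0;   // target-specific cached state

  void add_vector_cost (const cost_vec &entries)
  {
    for (const cost_entry &e : entries)
      if (e.kind == SCALAR_LOAD)
        scalar_loads_seen += e.count;
  }
};

// Variant B: the entry vector is bound when the object is created, so
// the cached state and the vector share one lifetime.
struct costs_per_attempt
{
  explicit costs_per_attempt (const cost_vec &entries)
    : m_entries (entries) {}

  // Walk the bound vector and update the cached state.
  void scan ()
  {
    for (const cost_entry &e : m_entries)
      if (e.kind == SCALAR_LOAD)
        scalar_loads_seen += e.count;
  }

  int scalar_loads_seen = 0;   // target-specific cached state

private:
  const cost_vec &m_entries;
};

// Stand-in for a create-costs hook that receives the vector up front.
std::unique_ptr<costs_per_attempt> create_costs (const cost_vec &entries)
{
  return std::make_unique<costs_per_attempt> (entries);
}
```

With variant A, feeding two unrelated vectors to one shared instance
leaves scalar_loads_seen holding the sum over both.  With variant B,
each object returned by create_costs only ever sees its own vector.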

Thanks,
Richard
