This patch aims to reduce compile time for the locality cloning pass by
reducing recursive calls to partition_callchain ().  This is achieved by
precomputing caller and callee information into locality_info.
locality_info stores all callees of a node, reached either directly or via
inlined nodes.  This avoids calls to partition_callchain () for inlined nodes,
which are already partitioned together with their inlined_to nodes.
locality_info also stores the precomputed accumulated incoming edge frequency
per unique caller, which avoids repeated computation within
partition_callchain ().
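
To illustrate the idea, here is a minimal standalone sketch of the
precomputation.  It is not the code from the patch: Node, LocalityInfo,
root_of and build_locality_info are hypothetical names over a toy call graph,
not GCC's cgraph API, and attributing edges from an inlined body to its
outermost inlined_to node is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call graph node (not GCC's cgraph_node).
struct Node {
  int inlined_to = -1;  // index of the node this one is inlined into, or -1
  std::vector<std::pair<int, double> > calls;  // (callee index, edge frequency)
};

// Hypothetical per-node summary, loosely modelled on the description above.
struct LocalityInfo {
  std::set<int> callees;              // callees reached directly or via inlined bodies
  std::map<int, double> caller_freq;  // accumulated incoming frequency per unique caller
};

// Follow inlined_to links to the node whose body finally contains N.
// Assumes the inlined_to chain is acyclic.
static int root_of(const std::vector<Node> &g, int n) {
  while (g[n].inlined_to != -1)
    n = g[n].inlined_to;
  return n;
}

// One pass over all edges, done once before partitioning.
std::vector<LocalityInfo> build_locality_info(const std::vector<Node> &g) {
  std::vector<LocalityInfo> info(g.size());
  for (std::size_t i = 0; i < g.size(); ++i) {
    // Edges leaving an inlined body are attributed to its outermost caller.
    int caller = root_of(g, static_cast<int>(i));
    for (const auto &e : g[i].calls) {
      int callee = e.first;
      assert(callee >= 0 && static_cast<std::size_t>(callee) < g.size());
      // An inlined callee is laid out with its inlined_to node, so it is
      // not recorded as a separate callee; its own calls are picked up
      // when the outer loop visits it.
      if (g[callee].inlined_to != -1)
        continue;
      info[caller].callees.insert(callee);
      info[callee].caller_freq[caller] += e.second;
    }
  }
  return info;
}
```

With such a table in place, a partitioning routine can iterate over
info[n].callees and look up info[n].caller_freq instead of recursing into
inlined nodes and re-summing edge frequencies on every visit.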

Compile time improves by approximately 45% for the bootstrap-lto-locality
config, which now takes 2-5% more time than bootstrap-lto.

This patch also adds proper memory management for pass-specific data
structures.

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for mainline?

Thanks,
Prachi

Signed-off-by: Prachi Godbole 

config/ChangeLog:

        * bootstrap-lto-locality.mk (STAGE2_CFLAGS): Add param
        lto-max-locality-partition.
        (STAGE3_CFLAGS): Ditto.
        (STAGEprofile_CFLAGS): Remove -fipa-reorder-for-locality.
        (STAGEtrain_CFLAGS): Ditto.

gcc/ChangeLog:

        * ipa-locality-cloning.cc (struct locality_info): New struct.
        (loc_infos): Ditto.
        (get_locality_info): New function.
        (populate_callee_locality_info): Ditto.
        (populate_caller_locality_info): Ditto.
        (create_locality_info): Ditto.
        (adjust_recursive_callees): Access node_to_clone by reference.
        (inline_clones): Access node_to_clone and clone_to_node by reference.
        (clone_node_as_needed): Ditto.
        (accumulate_incoming_edge_frequency): Remove function.
        (clone_node_p): New function.
        (partition_callchain): Change prototype.
        (locality_determine_ipa_order): Call create_locality_info ().
        (locality_determine_static_order): Ditto.
        (locality_partition_and_clone): Update call to partition_callchain ()
        according to the new prototype.
        (lc_execute): Allocate and free node_to_ch_info, node_to_clone,
        clone_to_node.

Attachment: 0001-Patch-Address-compile-time-issues-for-locality-cloni.patch