http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45199
Sebastian Pop <spop at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
--- Comment #4 from Sebastian Pop <spop at gcc dot gnu.org> 2010-11-30 23:08:20 UTC ---
The fix for this one is to disable a heuristic that aggregates writes to the
same array into the same partition:
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 007c4f3..2c2af2c 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -781,8 +781,9 @@ build_rdg_partition_for_component (struct graph *rdg, rdgc c,
      and determine those vertices that have some memory affinity with
      the current nodes in the component: these are stores to the same
      arrays, i.e. we're taking care of cache locality.  */
-  rdg_flag_similar_memory_accesses (rdg, partition, loops, processed,
-                                    other_stores);
+  if (!flag_tree_loop_distribute_patterns)
+    rdg_flag_similar_memory_accesses (rdg, partition, loops, processed,
+                                      other_stores);
   rdg_flag_loop_exits (rdg, loops, partition, processed, part_has_writes);
With this patch applied, the testcase of this PR generates the following code:
# .MEM_54 = VDEF <.MEM_62(D)>
__builtin_memset (&i_otyp, 0, 4000);
# .MEM_2 = VDEF <.MEM_54>
__builtin_memset (&i_styp, 0, 4000);
# .MEM_78 = VDEF <.MEM_2>
__builtin_memset (&l_numob, 0, 4000);
# .MEM_82 = VDEF <.MEM_78>
__builtin_memset (&i_otyp[1000], 0, 4000);
# .MEM_83 = VDEF <.MEM_82>
__builtin_memset (&i_styp[1000], 0, 4000);
# .MEM_89 = VDEF <.MEM_83>
__builtin_memset (&l_numob[1000], 0, 4000);
# .MEM_95 = VDEF <.MEM_89>
__builtin_memset (&i_otyp[2000], 0, 4000);
# .MEM_103 = VDEF <.MEM_95>
__builtin_memset (&i_styp[2000], 0, 4000);
# .MEM_104 = VDEF <.MEM_103>
__builtin_memset (&l_numob[2000], 0, 4000);
Note that i_otyp, for example, is written several times. With the heuristic
enabled, all of these writes end up in the same loop partition, which defeats
even the memset (0) pattern recognition.
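
For illustration only, here is a minimal C sketch of the kind of loop involved.
It is not the testcase of this PR: the loop shape, the array sizes, the element
type, and the helper names (zero_with_loop, zero_with_memsets, fill_arrays,
all_zero) are assumptions made up for this example. Only the array names and
the 1000-element offsets are taken from the dump above. The first function
writes each array three times per iteration; the second spells out by hand the
nine memset calls that one would hope to get once every store sits in its own
partition.

```c
#include <string.h>

#define N 1000  /* assumed block size; 1000 ints of 4 bytes give the 4000 above */

/* Hypothetical arrays named after those in the dump; sizes are assumed. */
int i_otyp[3 * N], i_styp[3 * N], l_numob[3 * N];

/* One loop, nine stores: each array is written at three different offsets. */
void zero_with_loop (void)
{
  for (int i = 0; i < N; i++)
    {
      i_otyp[i] = 0;          i_styp[i] = 0;          l_numob[i] = 0;
      i_otyp[i + N] = 0;      i_styp[i + N] = 0;      l_numob[i + N] = 0;
      i_otyp[i + 2 * N] = 0;  i_styp[i + 2 * N] = 0;  l_numob[i + 2 * N] = 0;
    }
}

/* Hand-written equivalent: one memset per store, in the order of the dump. */
void zero_with_memsets (void)
{
  for (int k = 0; k < 3; k++)
    {
      memset (&i_otyp[k * N], 0, N * sizeof (int));
      memset (&i_styp[k * N], 0, N * sizeof (int));
      memset (&l_numob[k * N], 0, N * sizeof (int));
    }
}

/* Helper for checking: fill every element of the three arrays with V. */
void fill_arrays (int v)
{
  for (int i = 0; i < 3 * N; i++)
    i_otyp[i] = i_styp[i] = l_numob[i] = v;
}

/* Helper for checking: return 1 if all three arrays contain only zeros. */
int all_zero (void)
{
  for (int i = 0; i < 3 * N; i++)
    if (i_otyp[i] != 0 || i_styp[i] != 0 || l_numob[i] != 0)
      return 0;
  return 1;
}
```

Both functions leave the arrays in the same state. Whether GCC actually turns
the first form into the second depends on the version and on the options in
use, so inspect the optimized dump rather than rely on this sketch.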