Re: [google][4.6] Bug fixes to function reordering linker plugin to handle local and comdat functions. (issue 5851044)

2012-03-21 Thread tmsriram

Committed to google/gcc-4_6 after validation.

On 2012/03/21 05:07:33, davidxl wrote:

ok for google branches after checkin validation.



David




http://codereview.appspot.com/5851044/


Re: [google][4.6] Bug fixes to function reordering linker plugin to handle local and comdat functions. (issue 5851044)

2012-03-20 Thread tmsriram

Uploaded new patch.

On 2012/03/20 19:25:38, davidxl wrote:

It would be nice to add some unit/regression test cases of some sort.


Made the existing unit test case check the final layout.



David



http://codereview.appspot.com/5851044/diff/1/callgraph.c
File callgraph.c (right):



http://codereview.appspot.com/5851044/diff/1/callgraph.c#newcode309
callgraph.c:309: if (!is_prefix_of (_ZL, name))
How about static functions in namespace? How about functions in

anonymous

namespace?


Thanks for pointing this out. One solution is to add more plugin
interfaces to plugin-api.h to find the section flags and the section
group. This way, comdats can be detected. All other duplicates can be
treated as file static functions. I marked this as TODO for now.

I am also thinking of sending a patch to generate unique section names
for file static functions. This will help --section-ordering-file in
gold too.


http://codereview.appspot.com/5851044/diff/1/callgraph.c#newcode511
callgraph.c:511: .text. };
How are the sections ordered in the array?  Keep it in mind that it is

possible

to encode the actual profile count of the function in the section name

in the

future.


Right, for now this parsing will work. The parsing needs to be updated
once the section names change.



http://codereview.appspot.com/5851044/


Re: [google][4.6] Bug fixes to function reordering linker plugin to handle local and comdat functions. (issue 5851044)

2012-03-20 Thread tmsriram

On 2012/03/20 19:26:26, eraman wrote:

http://codereview.appspot.com/5851044/diff/1/callgraph.c
File callgraph.c (right):



http://codereview.appspot.com/5851044/diff/1/callgraph.c#newcode513
callgraph.c:513: const int section_priority[] = {0, 3, 4, 2, 1};
Add a comment about section_priority


Done.



http://codereview.appspot.com/5851044/diff/1/callgraph.c#newcode571
callgraph.c:571: if (section_priority[kept-section_type]
Add an example that shows why we want to do that


I added a comment.



http://codereview.appspot.com/5851044/diff/1/callgraph.c#newcode655
callgraph.c:655: write_out_node (n_it-name, section_start[0],
section_end[0]);
In write_out_node, why take the function name and do a hash table

lookup to get

the section, instead of directly passing Section_id * in the caller.

In all

calls to write_out_node, you are in fact getting the name from the

Section_id *.

That is not true, I pass the Node * in the first two calls. What you
said holds for the other calls. So, I made this more efficient.



http://codereview.appspot.com/5851044/diff/1/callgraph.c#newcode674
callgraph.c:674: s_it-processed = 1;
setting processed to 1 is redundant as it is already done in

write_out_node.

Done.



http://codereview.appspot.com/5851044/diff/1/callgraph.h
File callgraph.h (right):



http://codereview.appspot.com/5851044/diff/1/callgraph.h#newcode31
callgraph.h:31: void push_mm_ptr (void *ptr);
push_allocated_ptr or save_allocated_ptr would be a better name.



Done.


http://codereview.appspot.com/5851044/diff/1/callgraph.h#newcode48
callgraph.h:48: push_mm_ptr (list);
It might be cleaner to create a wrapper around XNEW that calls

push_mm_ptr after

XNEW. Similar for malloc as well.


Done.

http://codereview.appspot.com/5851044/


Re: [google][4.6]Add new target builtin to check for amdfam15h processors (issue 5535046)

2012-01-11 Thread tmsriram

New patch uploaded with the changes.


* i386.c (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
(IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
(IX86_BUILTIN_CPU_IS_AMDFAM10H): Rename IX86_BUILTIN_CPU_IS_AMDFAM10.
(IX86_BUILTIN_CPU_IS_AMDFAM10H_BARCELONA): Rename
IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA.
(IX86_BUILTIN_CPU_IS_AMDFAM10H_SHANGHAI): Rename
IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI.
(IX86_BUILTIN_CPU_IS_AMDFAM10H_ISTANBUL): Rename
IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL.
(fold_builtin_cpu): Process IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1
and IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2.
(M_AMDFAM10H): Rename M_AMDFAM10.
(M_AMDFAM10H_BARCELONA): Rename M_AMDFAM10_BARCELONA.
(M_AMDFAM10H_SHANGHAI): Rename M_AMDFAM10_SHANGHAI.
(M_AMDFAM10H_ISTANBUL): Rename M_AMDFAM10_ISTANBUL.
(ix86_init_platform_type_builtins): Make new builtins
__builtin_cpu_is_amdfam15_bdver1
and __builtin_cpu_is_amdfam15_bdver2.
(ix86_expand_builtin): Expand IX86_BUILTIN_CPU_IS_AMDFAM15_BDVER1 and
IX86_BUILTIN_CPU_IS_AMDFAM15_BDVER2.
* testsuite/gcc.target/builtin_target.c (fn1): Call new builtins.

* i386-cpuinfo.c (__cpu_is_amdfam15h_bdver1): New member in __cpu_model
struct.
(__cpu_is_amdfam15h_bdver2): New member in __cpu_model struct.
(__cpu_model): Rename __cpu_is_amdfam10 to __cpu_is_amdfam10h.
Rename __cpu_is_amdfam10_barcelona to __cpu_is_amdfam10h_barcelona.
Rename __cpu_is_amdfam10_shanghai to __cpu_is_amdfam10h_shanghai.
Rename __cpu_is_amdfam10_istanbul to __cpu_is_amdfam10h_istanbul.
(get_amd_cpu): Check for family 15h processors.
(cpu_indicator_init): Adjust model and family for AMD processors.
Refactor code.


Thanks,
-Sri.

On 2012/01/11 00:05:01, davidxl wrote:

http://codereview.appspot.com/5535046/diff/1/gcc/config/i386/i386.c
File gcc/config/i386/i386.c (right):



http://codereview.appspot.com/5535046/diff/1/gcc/config/i386/i386.c#newcode26032

gcc/config/i386/i386.c:26032: +M_AMDFAM15,
Maybe better to change 10 to 10H, and 15 to 15H in the name as the

number is

hex.



Why not split the enum for family 15h into M_AMDFAM15H_BDVER1 and

.._BVER2?


http://codereview.appspot.com/5535046/diff/1/libgcc/config/i386/i386-cpuinfo.c

File libgcc/config/i386/i386-cpuinfo.c (right):



http://codereview.appspot.com/5535046/diff/1/libgcc/config/i386/i386-cpuinfo.c#newcode102

libgcc/config/i386/i386-cpuinfo.c:102: break;
No family15h model encoding?




http://codereview.appspot.com/5535046/


Re: [google][4.6]Compiler Directed Multiversioning with new -mvarch option (issue 5490054)

2011-12-16 Thread tmsriram

I have uploaded a new patch set with all the mentioned changes made. If
a function has the target attribute it will not be touched by the
autoclone pass.

Also fixed some test cases which were broken because the clone names
used '_' instead of '.' for suffixing.

On 2011/12/16 19:39:47, davidxl wrote:

http://codereview.appspot.com/5490054/diff/1011/config/i386/i386.c
File config/i386/i386.c (right):



http://codereview.appspot.com/5490054/diff/1011/config/i386/i386.c#newcode26569

config/i386/i386.c:26569: +mversion_for_core2 (tree

*optimization_node,

- mversionable_for_core2_p ?



http://codereview.appspot.com/5490054/diff/1011/mversn-dispatch.c
File mversn-dispatch.c (right):



http://codereview.appspot.com/5490054/diff/1011/mversn-dispatch.c#newcode931

mversn-dispatch.c:931: DECL_STATIC_DESTRUCTOR (new_decl) = 0;
Should you assert it instead? Should not clone ctor/dtors.



http://codereview.appspot.com/5490054/diff/1011/mversn-dispatch.c#newcode2221

mversn-dispatch.c:2221: VEC_truncate (edge, EXIT_BLOCK_PTR-preds, 0);
{} -- remove



http://codereview.appspot.com/5490054/diff/1011/mversn-dispatch.c#newcode2389

mversn-dispatch.c:2389:
How does it interact with manual multi-versioning from user? You

probably don't

want to clone functions that are marked with target attributes

(explicitly --

not implied from command line).




http://codereview.appspot.com/5490054/


Re: [4.6][google]Compiler Directed Function Multiversioning (issue 5477048)

2011-12-15 Thread tmsriram

Reviewers: davidxl,

Message:
I have uploaded a new patch. I did not split the patch into two because
I felt the ifunc dispatch patch does nothing as a standalone. The
framework now will do aggressive cloning and it will mark the clones
specialized for core2 with -mtune=core2. I have already submitted
another patch that will optimize vectorization for core2.

I also tested Teresa's RAT stalls avoidance in my MV framework and it
works great. I removed that from the current patch as I thought that
should go in as a separate patch.

Pending: Test case.

On 2011/12/09 07:49:57, davidxl wrote:

It might be better to split this into two patches:



1) ifunc dispatch patch
2) auto cloning patch



The auto cloning framework is good in general, the main problem

potential

disconnection of early cloning decisions and later transformations.



the logic of deciding if loops are vectorizable is not the same as

that used in

the actual vectorization pass which can lead to either false positives
(unnecessary clones) and false negatives (missing clones of

vectorizable loops).

The false negatives are entirely possible because later

transformations such as

if-cvt can expose more opportunities.  To alleviate the problem, it

would be

more powerful if the framework allows more aggressive (speculative)

cloning, and

undo some of the cloning later when they turn out to be not needed.

This is

possible with some book keeping.



David



http://codereview.appspot.com/5477048/diff/1/config/i386/i386.c
File config/i386/i386.c (right):



http://codereview.appspot.com/5477048/diff/1/config/i386/i386.c#newcode26331

config/i386/i386.c:26331: +
THe -- The.



http://codereview.appspot.com/5477048/diff/1/config/i386/i386.c#newcode26365

config/i386/i386.c:26365: +   tree-vect-stmts.cc.  The loop is not

vectorizable

even if a single stmt is not
even if?



http://codereview.appspot.com/5477048/diff/1/config/i386/i386.c#newcode26367

config/i386/i386.c:26367: +
You need a good explanation here why a common utility function can not

be used,

but 'duplicate' the check here.



http://codereview.appspot.com/5477048/diff/1/config/i386/i386.c#newcode26524

config/i386/i386.c:26524: +  if (ix86_mv_arch_string
Using hardcoded target name here is not ideal -- better encode this

attribute in

the target/processor model array. For now better add a comment here.



http://codereview.appspot.com/5477048/diff/1/config/i386/i386.c#newcode26544

config/i386/i386.c:26544: +  if (*cond_func_decl == NULL)
Add a small example in the comment here. Also better use a helper

function.


http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c
File mversn-dispatch.c (right):



http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c#newcode2330

mversn-dispatch.c:2330: static unsigned int do_auto_clone (void)
Start a new line



http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c#newcode2371

mversn-dispatch.c:2371: return 0;
Fix indentation



http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c#newcode2404

mversn-dispatch.c:2404: NULL, fn_ver_addr_chain);
indentation



http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c#newcode2412

mversn-dispatch.c:2412: /* The current function is replaced by a ifunc

call to

the right version.
s/a/an/



http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c#newcode2424

mversn-dispatch.c:2424: cond_func_addr, fn_ver_addr_chain,
Indentation



http://codereview.appspot.com/5477048/diff/1/mversn-dispatch.c#newcode2431

mversn-dispatch.c:2431: gimple_set_bb (gsi_stmt (gsi), empty_bb);
Fix indentation.






Please review this at http://codereview.appspot.com/5477048/

Affected files:
  M config/i386/i386.c
  M config/i386/i386.opt
  M doc/tm.texi
  M doc/tm.texi.in
  M mversn-dispatch.c
  M params.def
  M passes.c
  M target.def
  M tree-pass.h




Re: [PATCH i386][google]With -mtune=core2, avoid generating the slow unaligned vector load/store (issue 5488054)

2011-12-13 Thread tmsriram

On 2011/12/14 00:04:10, davidxl wrote:

http://codereview.appspot.com/5488054/diff/4002/tree-vect-stmts.c
File tree-vect-stmts.c (right):



http://codereview.appspot.com/5488054/diff/4002/tree-vect-stmts.c#newcode3712

tree-vect-stmts.c:3712: }
The check can be put into a helper function.


Fixed.

http://codereview.appspot.com/5488054/