[google gcc-4_9] Fix bad LIPO profile produced by gcov-tool

2015-12-01 Thread Rong Xu
Hi,

This patch fixes the issue when using gcov-tool to merge LIPO profiles
after we compressing the module infomration . We should not decompress
the string as the compressed string should be written directly to the
profile later. Tested with some LIPO profiles.

Thanks,

-Rong
2015-12-01  Rong Xu  <x...@google.com>

* gcov-dump.c (tag_module_info): Dump string information.
* gcov-io.c (gcov_read_module_info): record combined_len
  and don't uncompress in gcov-tool.

Index: gcov-dump.c
===
--- gcov-dump.c (revision 231134)
+++ gcov-dump.c (working copy)
@@ -588,6 +588,11 @@ tag_module_info (const char *filename ATTRIBUTE_UN
 {
   if (!mod_info->is_primary)
printf ("%s\n", mod_info->source_filename);
+  unsigned short compressed_size = mod_info->combined_strlen;
+  unsigned short uncompressed_size = mod_info->combined_strlen>>16;
+  printf ("compressed_ strlen=%d uncompressed_strlen=%d String:\n",
+  compressed_size,uncompressed_size);
+  printf ("%s\n", mod_info->saved_cc1_strings);
 }
   else
 {
Index: gcov-io.c
===
--- gcov-io.c   (revision 231134)
+++ gcov-io.c   (working copy)
@@ -835,16 +835,18 @@ gcov_read_module_info (struct gcov_module_info *mo
   len -= (src_filename_len + 1);
 
   saved_compressed_len = (unsigned long) gcov_read_unsigned ();
-  saved_uncompressed_len  = saved_compressed_len >> 16;
-  saved_compressed_len &= 0x;
+  mod_info->combined_strlen = saved_compressed_len;
   tag_len = gcov_read_unsigned ();
   len -= (tag_len + 2);
   gcc_assert (!len);
   compressed_array = (char *) xmalloc (tag_len * sizeof (gcov_unsigned_t));
-  uncompressed_array = (char *) xmalloc (saved_uncompressed_len);
   for (i = 0; i < tag_len; i++)
 ((gcov_unsigned_t *) compressed_array)[i] = gcov_read_unsigned ();
 
+#if !defined (IN_GCOV_TOOL)
+  saved_uncompressed_len  = saved_compressed_len >> 16;
+  saved_compressed_len &= 0x;
+  uncompressed_array = (char *) xmalloc (saved_uncompressed_len);
   result_len = saved_uncompressed_len;
   uncompress ((Bytef *)uncompressed_array, _len,
   (const Bytef *)compressed_array, saved_compressed_len);
@@ -851,6 +853,9 @@ gcov_read_module_info (struct gcov_module_info *mo
   gcc_assert (result_len == saved_uncompressed_len);
   mod_info->saved_cc1_strings = uncompressed_array;
   free (compressed_array);
+#else /* IN_GCOV_TOOL: we don't need to uncompress. It's a pass through.  */
+  mod_info->saved_cc1_strings = compressed_array;
+#endif
 }
 #endif
 


Re: [google gcc-4_9] Fix bad LIPO profile produced by gcov-tool

2015-12-01 Thread Rong Xu
This is only needed for gcov-tool as we need to rewrite the
moduel-info to the profile (this is only used in decompress)

The transitional compiler path does not need it because the string is
already decompressed. It only needs to use the strings.

gcov-dump in theory does not need it either if it only dumps the
strings. But now I added the printing of both lengths. So now it is
also needed.

On Tue, Dec 1, 2015 at 4:46 PM, Xinliang David Li <davi...@google.com> wrote:
> Not sure about this line:
>
> mod_info->combined_strlen = saved_compressed_len;
>
> This did not exist for the compiler path before.
>
> On Tue, Dec 1, 2015 at 4:34 PM, Rong Xu <x...@google.com> wrote:
>>
>> Hi,
>>
>> This patch fixes the issue when using gcov-tool to merge LIPO profiles
>> after we compressing the module infomration . We should not decompress
>> the string as the compressed string should be written directly to the
>> profile later. Tested with some LIPO profiles.
>>
>> Thanks,
>>
>> -Rong
>
>


Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-06 Thread Rong Xu
It's 1:3 to 1:4 in the programs I tested. But it really depends on how
the options are used. I think your idea of using combined strlen works
better.
I just make the code a little clumsy but it does not cause any
performance issue.

On Tue, Oct 6, 2015 at 10:21 AM, Xinliang David Li <davi...@google.com> wrote:
> On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu <x...@google.com> wrote:
>> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li <davi...@google.com> wrote:
>>>unsigned ggc_memory = gcov_read_unsigned ();
>>> +  unsigned marker = 0, len = 0, k;
>>> +  char **string_array, *saved_cc1_strings;
>>> +
>>>for (unsigned j = 0; j < 7; j++)
>>>
>>>
>>> Do not use hard coded number. Use the enum defined in coverage.c.
>>
>> OK.
>>
>>>
>>>
>>> +string_array[j] = xstrdup (gcov_read_string ());
>>> +
>>> +  k = 0;
>>> +  for (unsigned j = 1; j < 7; j++)
>>>
>>> Do not use hard coded number.
>>
>> OK.
>>
>>>
>>>
>>> +{
>>> +  if (num_array[j] == 0)
>>> +continue;
>>> +  marker += num_array[j];
>>>
>>> It is better to read if the name of variable 'marker' is changed to
>>> 'j_end' or something similar
>>>
>>> For all the substrings of 'j' kind, there should be just one marker,
>>> right? It looks like here you introduce one marker per string, not one
>>> marker per string kind.
>>
>> I don't understand what you meant here. "marker" is fixed for each j
>> substring (one option kind) -- it the end index of the sub-string
>> array. k-loop is for each string.
>>
>
> That was a wrong comment from me. Discard it.
>
>>>
>>> +  len += 3; /* [[  */
>>>
>>> Same here for hard coded value.
>>>
>>> +  for (; k < marker; k++)
>>> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']'  
>>> */
>>>
>>> Why do we need one ']' per string?
>>
>> This is because the options strings can contain space '  '. I cannot
>> use space as the delimiter, neither is \0 as it is the end of the
>> string of the encoded string.
>
> Ok -- this allows you to avoid string copy during parsing.
>>
>>>
>>>
>>> +}
>>> +  saved_cc1_strings = (char *) xmalloc (len + 1);
>>> +  saved_cc1_strings[0] = 0;
>>> +
>>> +  marker = 0;
>>> +  k = 0;
>>> +  for (unsigned j = 1; j < 7; j++)
>>>
>>> Same here for 7.
>>
>> will fix in the new patch.
>>
>>>
>>> +{
>>> +  static const char lipo_string_flags[6] = {'Q', 'B', 'S',
>>> 'D','I', 'C'};
>>> +  if (num_array[j] == 0)
>>> +continue;
>>> +  marker += num_array[j];
>>>
>>> Suggest changing marker to j_end
>> OK.
>>
>>>
>>> +  sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
>>> +   lipo_string_flags[j - 1]);
>>> +  for (; k < marker; k++)
>>> +{
>>> +  sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
>>> +   string_array[k]);
>>>
>>> +#define DELIMTER"[["
>>>
>>> Why double '[' ?
>> I will change to single '['.
>>
>>>
>>> +#define DELIMTER2   "]"
>>> +#define QUOTE_PATH_FLAG 'Q'
>>> +#define BRACKET_PATH_FLAG   'B'
>>> +#define SYSTEM_PATH_FLAG'S'
>>> +#define D_U_OPTION_FLAG 'D'
>>> +#define INCLUDE_OPTION_FLAG 'I'
>>> +#define COMMAND_ARG_FLAG'C'
>>> +
>>> +enum lipo_cc1_string_kind {
>>> +  k_quote_paths = 0,
>>> +  k_bracket_paths,
>>> +  k_system_paths,
>>> +  k_cpp_defines,
>>> +  k_cpp_includes,
>>> +  k_lipo_cl_args,
>>> +  num_lipo_cc1_string_kind
>>> +};
>>> +
>>> +struct lipo_parsed_cc1_string {
>>> +  const char* source_filename;
>>> +  unsigned num[num_lipo_cc1_string_kind];
>>> +  char **strings[num_lipo_cc1_string_kind];
>>> +};
>>> +
>>> +struct lipo_parsed_cc1_string *
>>> +lipo_parse_saved_cc1_string (const char *src, char *str,
>>> +bool parse_cl_args_only);
>>> +void free_parse

Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-06 Thread Rong Xu
Here is the patch set 2 that integrates David's comments. Note that
this uses the combined strlen (i.e. encoding compressed and
uncompressed strlen into one gcov_unsigned_t).

Testing is ongoing.

-Rong

On Tue, Oct 6, 2015 at 11:30 AM, Rong Xu <x...@google.com> wrote:
> It's 1:3 to 1:4 in the programs I tested. But it really depends on how
> the options are used. I think your idea of using combined strlen works
> better.
> I just make the code a little clumsy but it does not cause any
> performance issue.
>
> On Tue, Oct 6, 2015 at 10:21 AM, Xinliang David Li <davi...@google.com> wrote:
>> On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu <x...@google.com> wrote:
>>> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li <davi...@google.com> 
>>> wrote:
>>>>unsigned ggc_memory = gcov_read_unsigned ();
>>>> +  unsigned marker = 0, len = 0, k;
>>>> +  char **string_array, *saved_cc1_strings;
>>>> +
>>>>for (unsigned j = 0; j < 7; j++)
>>>>
>>>>
>>>> Do not use hard coded number. Use the enum defined in coverage.c.
>>>
>>> OK.
>>>
>>>>
>>>>
>>>> +string_array[j] = xstrdup (gcov_read_string ());
>>>> +
>>>> +  k = 0;
>>>> +  for (unsigned j = 1; j < 7; j++)
>>>>
>>>> Do not use hard coded number.
>>>
>>> OK.
>>>
>>>>
>>>>
>>>> +{
>>>> +  if (num_array[j] == 0)
>>>> +continue;
>>>> +  marker += num_array[j];
>>>>
>>>> It is better to read if the name of variable 'marker' is changed to
>>>> 'j_end' or something similar
>>>>
>>>> For all the substrings of 'j' kind, there should be just one marker,
>>>> right? It looks like here you introduce one marker per string, not one
>>>> marker per string kind.
>>>
>>> I don't understand what you meant here. "marker" is fixed for each j
>>> substring (one option kind) -- it the end index of the sub-string
>>> array. k-loop is for each string.
>>>
>>
>> That was a wrong comment from me. Discard it.
>>
>>>>
>>>> +  len += 3; /* [[  */
>>>>
>>>> Same here for hard coded value.
>>>>
>>>> +  for (; k < marker; k++)
>>>> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' 
>>>>  */
>>>>
>>>> Why do we need one ']' per string?
>>>
>>> This is because the options strings can contain space '  '. I cannot
>>> use space as the delimiter, neither is \0 as it is the end of the
>>> string of the encoded string.
>>
>> Ok -- this allows you to avoid string copy during parsing.
>>>
>>>>
>>>>
>>>> +}
>>>> +  saved_cc1_strings = (char *) xmalloc (len + 1);
>>>> +  saved_cc1_strings[0] = 0;
>>>> +
>>>> +  marker = 0;
>>>> +  k = 0;
>>>> +  for (unsigned j = 1; j < 7; j++)
>>>>
>>>> Same here for 7.
>>>
>>> will fix in the new patch.
>>>
>>>>
>>>> +{
>>>> +  static const char lipo_string_flags[6] = {'Q', 'B', 'S',
>>>> 'D','I', 'C'};
>>>> +  if (num_array[j] == 0)
>>>> +continue;
>>>> +  marker += num_array[j];
>>>>
>>>> Suggest changing marker to j_end
>>> OK.
>>>
>>>>
>>>> +  sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
>>>> +   lipo_string_flags[j - 1]);
>>>> +  for (; k < marker; k++)
>>>> +{
>>>> +  sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
>>>> +   string_array[k]);
>>>>
>>>> +#define DELIMTER"[["
>>>>
>>>> Why double '[' ?
>>> I will change to single '['.
>>>
>>>>
>>>> +#define DELIMTER2   "]"
>>>> +#define QUOTE_PATH_FLAG 'Q'
>>>> +#define BRACKET_PATH_FLAG   'B'
>>>> +#define SYSTEM_PATH_FLAG'S'
>>>> +#define D_U_OPTION_FLAG 'D'
>>>> +#define INCLUDE_OPTION_FLAG 'I'
>>>>

Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-06 Thread Rong Xu
On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li <davi...@google.com> wrote:
>unsigned ggc_memory = gcov_read_unsigned ();
> +  unsigned marker = 0, len = 0, k;
> +  char **string_array, *saved_cc1_strings;
> +
>for (unsigned j = 0; j < 7; j++)
>
>
> Do not use hard coded number. Use the enum defined in coverage.c.

OK.

>
>
> +string_array[j] = xstrdup (gcov_read_string ());
> +
> +  k = 0;
> +  for (unsigned j = 1; j < 7; j++)
>
> Do not use hard coded number.

OK.

>
>
> +{
> +  if (num_array[j] == 0)
> +continue;
> +  marker += num_array[j];
>
> It is better to read if the name of variable 'marker' is changed to
> 'j_end' or something similar
>
> For all the substrings of 'j' kind, there should be just one marker,
> right? It looks like here you introduce one marker per string, not one
> marker per string kind.

I don't understand what you meant here. "marker" is fixed for each j
substring (one option kind) -- it the end index of the sub-string
array. k-loop is for each string.

>
> +  len += 3; /* [[  */
>
> Same here for hard coded value.
>
> +  for (; k < marker; k++)
> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']'  */
>
> Why do we need one ']' per string?

This is because the options strings can contain space '  '. I cannot
use space as the delimiter, neither is \0 as it is the end of the
string of the encoded string.

>
>
> +}
> +  saved_cc1_strings = (char *) xmalloc (len + 1);
> +  saved_cc1_strings[0] = 0;
> +
> +  marker = 0;
> +  k = 0;
> +  for (unsigned j = 1; j < 7; j++)
>
> Same here for 7.

will fix in the new patch.

>
> +{
> +  static const char lipo_string_flags[6] = {'Q', 'B', 'S',
> 'D','I', 'C'};
> +  if (num_array[j] == 0)
> +continue;
> +  marker += num_array[j];
>
> Suggest changing marker to j_end
OK.

>
> +  sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
> +   lipo_string_flags[j - 1]);
> +  for (; k < marker; k++)
> +{
> +  sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
> +   string_array[k]);
>
> +#define DELIMTER"[["
>
> Why double '[' ?
I will change to single '['.

>
> +#define DELIMTER2   "]"
> +#define QUOTE_PATH_FLAG 'Q'
> +#define BRACKET_PATH_FLAG   'B'
> +#define SYSTEM_PATH_FLAG'S'
> +#define D_U_OPTION_FLAG 'D'
> +#define INCLUDE_OPTION_FLAG 'I'
> +#define COMMAND_ARG_FLAG'C'
> +
> +enum lipo_cc1_string_kind {
> +  k_quote_paths = 0,
> +  k_bracket_paths,
> +  k_system_paths,
> +  k_cpp_defines,
> +  k_cpp_includes,
> +  k_lipo_cl_args,
> +  num_lipo_cc1_string_kind
> +};
> +
> +struct lipo_parsed_cc1_string {
> +  const char* source_filename;
> +  unsigned num[num_lipo_cc1_string_kind];
> +  char **strings[num_lipo_cc1_string_kind];
> +};
> +
> +struct lipo_parsed_cc1_string *
> +lipo_parse_saved_cc1_string (const char *src, char *str,
> +bool parse_cl_args_only);
> +void free_parsed_string (struct lipo_parsed_cc1_string *string);
> +
>
> Declare above in a header file.

OK.

>
>
>  /* Returns true if the command-line arguments stored in the given 
> module-infos
> are incompatible.  */
>  bool
> -incompatible_cl_args (struct gcov_module_info* mod_info1,
> -  struct gcov_module_info* mod_info2)
> +incompatible_cl_args (struct lipo_parsed_cc1_string* mod_info1,
> +  struct lipo_parsed_cc1_string* mod_info2)
>
> Fix formating.
OK.
>
>  {
>  {
> @@ -1647,7 +1679,7 @@ build_var (tree fn_decl, tree type, int counter)
>  /* Creates the gcov_fn_info RECORD_TYPE.  */
>
>NULL_TREE, get_gcov_unsigned_t ());
>DECL_CHAIN (field) = fields;
>fields = field;
>
> -  /* Num bracket paths  */
> +  /* cc1_uncompressed_strlen field */
>field = build_decl (BUILTINS_LOCATION, FIELD_DECL,
>NULL_TREE, get_gcov_unsigned_t ());
>DECL_CHAIN (field) = fields;
>fields = field;
>
>
> Why do we need to store uncompressed string length? If there is need
> to do that, I suggest combine uncompressed length and compressed
> length into one 32bit integer. (16/16, or 17/15 split)

In theory, I don't need the uncompressed length, But I would need to
guess the uncompressed length to allocate the buffer. If the
decompressing fails with insufficient space, I need to have another
guess and do it again.  I think it's 

[google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-05 Thread Rong Xu
Hi,

This patch is for google branch only.

It encodes and compresses various cc1 option strings in
gcov_module_info to reduce the lipo instrumented object size. The
savings are from string compression and the reduced number of
relocations.

More specifically, we replace the following fields in gcov_module_info
  gcov_unsigned_t num_quote_paths;
  gcov_unsigned_t num_bracket_paths;
  gcov_unsigned_t num_system_paths;
  gcov_unsigned_t num_cpp_defines;
  gcov_unsigned_t num_cpp_includes;
  gcov_unsigned_t num_cl_args;
  char *string_array[1];
with
  gcov_unsigned_t cc1_strlen;
  gcov_unsigned_t cc1_uncompressed_strlen;
  char *saved_cc1_strings;

The new saved_cc1_strings are zlib compressed string.

Tested with google internal benchmarks.

Thanks,

-Rong
2015-10-05  Rong Xu  <x...@google.com>

* gcc/Makefile.in (gcov-dump): link with zlib
(gcov-tool): Ditto.
* gcc/auto-profile.c (autofdo_module_profile::read): convert old
gcov_module_info to the new format.
(read_aux_modules): using new incompatible_cl_args interface.
* gcc/coverage.c (zlib.h): new include.
(enum lipo_cc1_string_kind): new define.
(struct lipo_parsed_cc1_string): new define.
(incompatible_cl_args): change the paramter.
(read_counts_file): using new gcov_module_info.
(build_fn_info_type): remove used parameter.
(build_gcov_module_info_type): using new gcov_module_info.
(free_parsed_string): new function.
(find_substr): ditto.
(lipo_parse_saved_cc1_string): ditto.
(lipo_append_tag): ditto.
(build_gcov_module_info_value): using new gcov_module_info.
(coverage_obj_init): remove used parameter.
(add_module_info): using new gcov_module_info.
(process_include): ditto.
(process_include_paths_1): ditto.
(process_include_paths): ditto.
(set_lipo_c_parsing_context): ditto.
* gcc/coverage.h: add new decls.
* gcc/gcov-io.c (zlib.h) new include.
(gcov_read_module_info): using new gcov_module_info.
* gcc/gcov-io.h (struct gcov_module_info): new gcov_module_info.
* libgcc/dyn-ipa.c: delete unused code.
* libgcc/libgcov-driver.c (gcov_write_module_info): using new
gcov_module_info.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 228354)
+++ gcc/Makefile.in (working copy)
@@ -2583,7 +2583,7 @@ gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
 GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o
 gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \
-   $(LIBS) -o $@
+   $(LIBS)  $(ZLIB) -o $@
 
 GCOV_TOOL_DEP_FILES = $(srcdir)/../libgcc/libgcov-util.c gcov-io.c 
$(GCOV_IO_H) \
   $(srcdir)/../libgcc/libgcov-driver.c 
$(srcdir)/../libgcc/libgcov-driver-system.c \
@@ -2607,7 +2607,7 @@ gcov-tool-params.o: params.c $(CONFIG_H) $(SYSTEM_
 GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o libgcov-driver-tool.o \
 libgcov-merge-tool.o gcov-tool-dyn-ipa.o gcov-tool-params.o
 gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS)
-   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) $(LIBS) -o $@
+   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) $(LIBS)  
$(ZLIB) -o $@
 #
 # Build the include directories.  The stamp files are stmp-* rather than
 # s-* so that mostlyclean does not force the include directory to
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c  (revision 228354)
+++ gcc/auto-profile.c  (working copy)
@@ -947,7 +947,6 @@ autofdo_source_profile::get_function_instance_by_i
   return s;
 }
 
-
 /* Member functions for autofdo_module_profile.  */
 
 bool
@@ -972,6 +971,9 @@ autofdo_module_profile::read ()
   unsigned exported = gcov_read_unsigned ();
   unsigned lang = gcov_read_unsigned ();
   unsigned ggc_memory = gcov_read_unsigned ();
+  unsigned marker = 0, len = 0, k;
+  char **string_array, *saved_cc1_strings;
+
   for (unsigned j = 0; j < 7; j++)
 {
   num_array[j] = gcov_read_unsigned ();
@@ -988,19 +990,48 @@ autofdo_module_profile::read ()
   module->ident = i + 1;
   module->lang = lang;
   module->ggc_memory = ggc_memory;
-  module->num_quote_paths = num_array[1];
-  module->num_bracket_paths = num_array[2];
-  module->num_system_paths = num_array[3];
-  module->num_cpp_defines = num_array[4];
-  module->num_cpp_includes = num_array[5];
-  module->num_cl_args = num_array[6];
   module->source_filename = name;
   module->is_primary = strcmp (name, in_fnames[0]) == 0;
   module->flags = module->is_primary ? exported : 1;
   for (unsigned j = 0; j < num_array[0]; j++)
 ret.first->second.first.safe_push 

[google][gcc-4_9] Remove unused key field in gcov_fn_info

2015-09-29 Thread Rong Xu
Hi,

This patch is for google/gcc-4_9 branch.

The 'key' field in gcov_fn_info is designed to allow gcov function
data to be COMDATTed, but the comdat elimination never works. This
patch removes this field to reduce the instrumented object size.

Thanks,

-Rong
Removed the unused 'key' field in gcov_fn_info to reduce the 
instrumented objects size.

2015-09-29  Rong Xu  <x...@google.com>

* gcc/coverage.c (build_fn_info_type): Remove 'key'
field. (build_fn_info): Ditto.
(coverage_obj_fn): Ditto.
* libgcc/libgcov.h (struct gcov_fn_info): Ditto.
* libgcc/libgcov-driver.c (gcov_compute_histogram): Ditto.
(gcov_exit_compute_summary): Ditto.
(gcov_exit_merge_gcda): Ditto.
(gcov_write_func_counters): Ditto.
(gcov_clear): Ditto.
* libgcc/libgcov-util.c (tag_function): Ditto.
(gcov_merge): Ditto.
(gcov_profile_scale): Ditto.
(gcov_profile_normalize): Ditto.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 228223)
+++ gcc/coverage.c  (working copy)
@@ -189,7 +189,7 @@ static void read_counts_file (const char *, unsign
 static tree build_var (tree, tree, int);
 static void build_fn_info_type (tree, unsigned, tree);
 static void build_info_type (tree, tree);
-static tree build_fn_info (const struct coverage_data *, tree, tree);
+static tree build_fn_info (const struct coverage_data *, tree);
 static tree build_info (tree, tree);
 static bool coverage_obj_init (void);
 static vec<constructor_elt, va_gc> *coverage_obj_fn
@@ -1668,16 +1668,9 @@ build_fn_info_type (tree type, unsigned counters,
 
   finish_builtin_struct (ctr_info, "__gcov_ctr_info", fields, NULL_TREE);
 
-  /* key */
-  field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
- build_pointer_type (build_qualified_type
- (gcov_info_type, TYPE_QUAL_CONST)));
-  fields = field;
-
   /* ident */
   field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
  get_gcov_unsigned_t ());
-  DECL_CHAIN (field) = fields;
   fields = field;
 
   /* lineno_checksum */
@@ -1705,10 +1698,10 @@ build_fn_info_type (tree type, unsigned counters,
 
 /* Returns a CONSTRUCTOR for a gcov_fn_info.  DATA is
the coverage data for the function and TYPE is the gcov_fn_info
-   RECORD_TYPE.  KEY is the object file key.  */
+   RECORD_TYPE.  */
 
 static tree
-build_fn_info (const struct coverage_data *data, tree type, tree key)
+build_fn_info (const struct coverage_data *data, tree type)
 {
   tree fields = TYPE_FIELDS (type);
   tree ctr_type;
@@ -1716,11 +1709,6 @@ static tree
   vec<constructor_elt, va_gc> *v1 = NULL;
   vec<constructor_elt, va_gc> *v2 = NULL;
 
-  /* key */
-  CONSTRUCTOR_APPEND_ELT (v1, fields,
- build1 (ADDR_EXPR, TREE_TYPE (fields), key));
-  fields = DECL_CHAIN (fields);
-  
   /* ident */
   CONSTRUCTOR_APPEND_ELT (v1, fields,
  build_int_cstu (get_gcov_unsigned_t (),
@@ -2556,7 +2544,7 @@ static vec<constructor_elt, va_gc> *
 coverage_obj_fn (vec<constructor_elt, va_gc> *ctor, tree fn,
 struct coverage_data const *data)
 {
-  tree init = build_fn_info (data, gcov_fn_info_type, gcov_info_var);
+  tree init = build_fn_info (data, gcov_fn_info_type);
   tree var = build_var (fn, gcov_fn_info_type, -1);
   
   DECL_INITIAL (var) = init;
Index: libgcc/libgcov-driver.c
===
--- libgcc/libgcov-driver.c (revision 227984)
+++ libgcc/libgcov-driver.c (working copy)
@@ -380,7 +380,7 @@ gcov_compute_histogram (struct gcov_summary *sum)
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+  if (!gfi_ptr)
 continue;
 
   ci_ptr = _ptr->ctrs[ctr_info_ix];
@@ -430,9 +430,6 @@ gcov_exit_compute_summary (struct gcov_summary *th
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (gfi_ptr && gfi_ptr->key != gi_ptr)
-gfi_ptr = 0;
-
   crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->cfg_checksum : 0);
   crc32 = crc32_unsigned (crc32,
   gfi_ptr ? gfi_ptr->lineno_checksum : 0);
@@ -688,7 +685,7 @@ gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
   if (length != GCOV_TAG_FUNCTION_LENGTH)
 goto read_mismatch;
 
-  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+  if (!gfi_ptr)
 {
   /* This function appears in the other program.  We
  need to buffer the information in order to write
@@ -832,10 +829,8 @@ gcov_write_func_counters (struct gcov_info *gi_ptr
   else
 {
   gfi_p

Re: [google][gcc-4_9] Remove unused key field in gcov_fn_info

2015-09-29 Thread Rong Xu
You are right. I attached the updated patch to this email.

On Tue, Sep 29, 2015 at 3:10 PM, Xinliang David Li <davi...@google.com> wrote:
>else
>  {
>gfi_ptr = gi_ptr->functions[f_ix];
> -  if (gfi_ptr && gfi_ptr->key == gi_ptr)
> +  if (gfi_ptr)
>  length = GCOV_TAG_FUNCTION_LENGTH;
> -  else
> -length = 0;
>  }
>
> The removal of 'else' path seems wrong.
>
> David
>
>
> On Tue, Sep 29, 2015 at 1:46 PM, Rong Xu <x...@google.com> wrote:
>> Hi,
>>
>> This patch is for google/gcc-4_9 branch.
>>
>> The 'key' field in gcov_fn_info is designed to allow gcov function
>> data to be COMDATTed, but the comdat elimination never works. This
>> patch removes this field to reduce the instrumented object size.
>>
>> Thanks,
>>
>> -Rong
Removed the unused 'key' field in gcov_fn_info to reduce the 
instrumented objects size.

2015-09-29  Rong Xu  <x...@google.com>

* gcc/coverage.c (build_fn_info_type): Remove 'key'
field. (build_fn_info): Ditto.
(coverage_obj_fn): Ditto.
* libgcc/libgcov.h (struct gcov_fn_info): Ditto.
* libgcc/libgcov-driver.c (gcov_compute_histogram): Ditto.
(gcov_exit_compute_summary): Ditto.
(gcov_exit_merge_gcda): Ditto.
(gcov_write_func_counters): Ditto.
(gcov_clear): Ditto.
* libgcc/libgcov-util.c (tag_function): Ditto.
(gcov_merge): Ditto.
(gcov_profile_scale): Ditto.
(gcov_profile_normalize): Ditto.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 228223)
+++ gcc/coverage.c  (working copy)
@@ -189,7 +189,7 @@ static void read_counts_file (const char *, unsign
 static tree build_var (tree, tree, int);
 static void build_fn_info_type (tree, unsigned, tree);
 static void build_info_type (tree, tree);
-static tree build_fn_info (const struct coverage_data *, tree, tree);
+static tree build_fn_info (const struct coverage_data *, tree);
 static tree build_info (tree, tree);
 static bool coverage_obj_init (void);
 static vec<constructor_elt, va_gc> *coverage_obj_fn
@@ -1668,16 +1668,9 @@ build_fn_info_type (tree type, unsigned counters,
 
   finish_builtin_struct (ctr_info, "__gcov_ctr_info", fields, NULL_TREE);
 
-  /* key */
-  field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
- build_pointer_type (build_qualified_type
- (gcov_info_type, TYPE_QUAL_CONST)));
-  fields = field;
-
   /* ident */
   field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
  get_gcov_unsigned_t ());
-  DECL_CHAIN (field) = fields;
   fields = field;
 
   /* lineno_checksum */
@@ -1705,10 +1698,10 @@ build_fn_info_type (tree type, unsigned counters,
 
 /* Returns a CONSTRUCTOR for a gcov_fn_info.  DATA is
the coverage data for the function and TYPE is the gcov_fn_info
-   RECORD_TYPE.  KEY is the object file key.  */
+   RECORD_TYPE.  */
 
 static tree
-build_fn_info (const struct coverage_data *data, tree type, tree key)
+build_fn_info (const struct coverage_data *data, tree type)
 {
   tree fields = TYPE_FIELDS (type);
   tree ctr_type;
@@ -1716,11 +1709,6 @@ static tree
   vec<constructor_elt, va_gc> *v1 = NULL;
   vec<constructor_elt, va_gc> *v2 = NULL;
 
-  /* key */
-  CONSTRUCTOR_APPEND_ELT (v1, fields,
- build1 (ADDR_EXPR, TREE_TYPE (fields), key));
-  fields = DECL_CHAIN (fields);
-  
   /* ident */
   CONSTRUCTOR_APPEND_ELT (v1, fields,
  build_int_cstu (get_gcov_unsigned_t (),
@@ -2556,7 +2544,7 @@ static vec<constructor_elt, va_gc> *
 coverage_obj_fn (vec<constructor_elt, va_gc> *ctor, tree fn,
 struct coverage_data const *data)
 {
-  tree init = build_fn_info (data, gcov_fn_info_type, gcov_info_var);
+  tree init = build_fn_info (data, gcov_fn_info_type);
   tree var = build_var (fn, gcov_fn_info_type, -1);
   
   DECL_INITIAL (var) = init;
Index: libgcc/libgcov-driver.c
===
--- libgcc/libgcov-driver.c (revision 227984)
+++ libgcc/libgcov-driver.c (working copy)
@@ -380,7 +380,7 @@ gcov_compute_histogram (struct gcov_summary *sum)
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+  if (!gfi_ptr)
 continue;
 
   ci_ptr = _ptr->ctrs[ctr_info_ix];
@@ -430,9 +430,6 @@ gcov_exit_compute_summary (struct gcov_summary *th
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (gfi_ptr && gfi_ptr->key != gi_ptr)
-g

[PATCH] multi-target indirect-call promotion

2014-11-17 Thread Rong Xu
Hi,

This patch implements the use phase of indirect-call-topn-profile that
promotes multiple targets of indirect-calls. It addresses pr/45631.

The main idea is to use current speculation framework, with a vector
of direct edges (sorted by the probability).  The trick part is the we
have multiple edges associating withe same call_stmt.
Resolve_speculation and edirect_call_stmt_to_callee are more complex
now.
This patch still has some issues. It runs into an assertion
verify_cgraph_node assertion() when ruing 483.xalancbmk in spec2006.

I think Honza should be able to quickly spot the issue and give
valuable suggestions.

Thanks,

-Rong


indirect_call_topn_profile_use_patch
Description: Binary data


[google gcc-4.9] FDO build for linux kernel

2014-11-13 Thread Rong Xu
Here is patch that ports our work on FDO linux kernel build support to
gcc-4_9. With this patch, kernel will use the libgcov functions to
dump the gcda files.

This patch also enables LIPO build. But the module grouping is not
computed online. We will use gcov-tool to do this offline.

Tested with kernel FDO build, regression and google internal benchmarks.

Thanks,

-Rong


kfdo_patch
Description: Binary data


[google gcc-4_9] make ifunc support available for BIONIC

2014-10-27 Thread Rong Xu
ifunc support is hard-coded as false for BIONIC. This patch removes
this check and let
configure decide whether it should have ifunc support.

Thanks,

-Rong


ifunc_diff
Description: Binary data


[PATCH PR63581] Fix undefined references in debug_info

2014-10-17 Thread Rong Xu
Hi,

The attached patch fixes PR63581. The diagnosis is in the bug report.
Google ref b/17759776.

Tested with bootstrap and regression.

Thanks,

-Rong


63581_patch
Description: Binary data


Re: [PATCH PR63581] Fix undefined references in debug_info

2014-10-17 Thread Rong Xu
Here is the diff for ChangeLog

Index: ChangeLog
===
--- ChangeLog (revision 216415)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-10-17x...@google.com
+
+ * cfgrtl.c (emit_barrier_after_bb): Append the barrier to the
+ footer, instead of unconditionally overwritten.
+
 2014-10-17  Eric Botcazou  ebotca...@adacore.com

  * ipa-inline-transform.c (master_clone_with_noninline_clones_p): New.
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 216415)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,6 @@
+2014-10-17x...@google.com
+ * g++.dg/tree-prof/pr63581.C: New test.
+
 2014-10-17  Marek Polacek  pola...@redhat.com

  PR c/63543



On Fri, Oct 17, 2014 at 2:54 PM, Rong Xu x...@google.com wrote:
 Hi,

 The attached patch fixes PR63581. The diagnosis is in the bug report.
 Google ref b/17759776.

 Tested with bootstrap and regression.

 Thanks,

 -Rong


Re: [PATCH] add overlap function to gcov-tool

2014-10-08 Thread Rong Xu
On Tue, Oct 7, 2014 at 9:31 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,

 This patch adds overlap functionality to gcov-tool. The overlap score
 estimates the similarity of two profiles. Currently it only computes
 overlap for arc counters.

 The overlap score is defined as
 \sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all)
 where p1-counter[i] and p2-counter[2] are two matched counter from
 profile1 and profiler2.
 p1-sum-all and p2-sum-all are the sum-all counters in profiler1 and
 profile2, repetitively.

 The patch looks fine in general.  My statistics is all rusty, but can't we use
 one of the established techniques like Kullback-Leibler to compare the
 probabilitis distributions?

Interesting. I never thought of using Kullback-Leibler divergence.
It's very easy to switch to KL using this overlap framework -- only a
few lines of change.
The problem is KL divergence assumes absolute continuity (i.e. q(i)
==0 -- p(i) =0, which is
not true in our distribution, I'm not sure how to work around this.).

I did try earth-mover-distance (EMD) in our earlier internal version.
But since I used
uniform distance, the problem can be simplified to distribution diffs.

 It would be also nice to have ability to compare
 branch probabilities in btween train runs.

Do you mean to do the comparison in CFG rather on the raw counters?
We need gcno file to reconstruct the CFG. That needs some work.

-Rong


 Honza

 The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
 no match and 1.0 mean a perfect match.

 This tool can be used in performance triaging and reducing the fdo
 training set size (where similar inputs can be pruned).

 Tested with spec2006 profiles.

 Thanks,

 -Rong

 2014-10-07  Rong Xu  x...@google.com

   * gcc/gcov-tool.c (profile_overlap): New driver function
 to compute profile overlap.
   (print_overlap_usage_message): New.
   (overlap_usage): New.
   (do_overlap): New.
   (print_usage): Add calls to overlap function.
   (main): Ditto.
   * libgcc/libgcov-util.c (read_gcda_file): Fix format.
   (find_match_gcov_info): Ditto.
   (calculate_2_entries): New.
   (compute_one_gcov): Ditto.
   (gcov_info_count_all_cold): Ditto.
   (gcov_info_count_all_zero): Ditto.
   (extract_file_basename): Ditto.
   (get_file_basename): Ditto.
   (set_flag): Ditto.
   (matched_gcov_info): Ditto.
   (calculate_overlap): Ditto.
   (gcov_profile_overlap): Ditto.
   * libgcc/libgcov-driver.c (compute_summary): Make
 it avavilable for external calls.
   * gcc/doc/gcov-tool.texi: Add documentation.

 Index: gcc/gcov-tool.c
 ===
 --- gcc/gcov-tool.c   (revision 215981)
 +++ gcc/gcov-tool.c   (working copy)
 @@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
  #include getopt.h

  extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, 
 int);
 +extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
  extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
  extern int gcov_profile_scale (struct gcov_info*, float, int, int);
  extern struct gcov_info* gcov_read_profile_dir (const char*, int);
 @@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
return ret;
  }

 +/* Driver function to computer the overlap score b/w profile D1 and D2.
 +   Return 1 on error and 0 if OK.  */
 +
 +static int
 +profile_overlap (const char *d1, const char *d2)
 +{
 +  struct gcov_info *d1_profile;
 +  struct gcov_info *d2_profile;
 +
 +  d1_profile = gcov_read_profile_dir (d1, 0);
 +  if (!d1_profile)
 +return 1;
 +
 +  if (d2)
 +{
 +  d2_profile = gcov_read_profile_dir (d2, 0);
 +  if (!d2_profile)
 +return 1;
 +
 +  return gcov_profile_overlap (d1_profile, d2_profile);
 +}
 +
 +  return 1;
 +}
 +
 +/* Usage message for profile overlap.  */
 +
 +static void
 +print_overlap_usage_message (int error_p)
 +{
 +  FILE *file = error_p ? stderr : stdout;
 +
 +  fnotice (file,   overlap [options] dir1 dir2   Compute the 
 overlap of two profiles\n);
 +  fnotice (file, -v, --verbose   Verbose mode\n);
 +  fnotice (file, -h, --hotonly   Only print info 
 for hot objects/functions\n);
 +  fnotice (file, -f, --function  Print function 
 level info\n);
 +  fnotice (file, -F, --fullname  Print full 
 filename\n);
 +  fnotice (file, -o, --objectPrint object 
 level info\n);
 +  fnotice (file, -t float, --hot_threshold float Set the threshold 
 for hotness\n);
 +
 +}
 +
 +static const struct option overlap_options[] =
 +{
 +  { verbose,no_argument,   NULL, 'v' },
 +  { function,   no_argument,   NULL, 'f' },
 +  { fullname,   no_argument,   NULL, 'F' },
 +  { object

[PATCH] move gcov-dump functionality to gcov-tool

2014-10-08 Thread Rong Xu
Hi

This patch moves the gcov-dump functionality to gcov-tool (which is
installed by default).
The options are exactly the same as before. The difference is instead of calling
gcov-dump ..., we now call gcov-tool dump ...

gcov-dump is useful in debugging fdo issues. I think it would be very
helpful to make it available with the compiler.

gcov-dump.c and the build construct are deleted with this patch.

Tested with bootstrap and some spec2006 profiles.

Thanks,

-Rong


gcov_tool_dump_patch
Description: Binary data


[PATCH] add overlap function to gcov-tool

2014-10-07 Thread Rong Xu
Hi,

This patch adds overlap functionality to gcov-tool. The overlap score
estimates the similarity of two profiles. Currently it only computes
overlap for arc counters.

The overlap score is defined as
\sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all)
where p1-counter[i] and p2-counter[2] are two matched counter from
profile1 and profiler2.
p1-sum-all and p2-sum-all are the sum-all counters in profiler1 and
profile2, repetitively.

The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
no match and 1.0 mean a perfect match.

This tool can be used in performance triaging and reducing the fdo
training set size (where similar inputs can be pruned).

Tested with spec2006 profiles.

Thanks,

-Rong
2014-10-07  Rong Xu  x...@google.com

* gcc/gcov-tool.c (profile_overlap): New driver function
to compute profile overlap. 
(print_overlap_usage_message): New.
(overlap_usage): New.
(do_overlap): New.
(print_usage): Add calls to overlap function.
(main): Ditto.
* libgcc/libgcov-util.c (read_gcda_file): Fix format.
(find_match_gcov_info): Ditto.
(calculate_2_entries): New.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.
(gcov_info_count_all_zero): Ditto.
(extract_file_basename): Ditto.
(get_file_basename): Ditto.
(set_flag): Ditto.
(matched_gcov_info): Ditto.
(calculate_overlap): Ditto.
(gcov_profile_overlap): Ditto.
* libgcc/libgcov-driver.c (compute_summary): Make
it avavilable for external calls.
* gcc/doc/gcov-tool.texi: Add documentation.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 215981)
+++ gcc/gcov-tool.c (working copy)
@@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include getopt.h
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
+extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
 extern int gcov_profile_scale (struct gcov_info*, float, int, int);
 extern struct gcov_info* gcov_read_profile_dir (const char*, int);
@@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
   return ret;
 }
 
+/* Driver function to computer the overlap score b/w profile D1 and D2.
+   Return 1 on error and 0 if OK.  */
+
+static int
+profile_overlap (const char *d1, const char *d2)
+{
+  struct gcov_info *d1_profile;
+  struct gcov_info *d2_profile;
+
+  d1_profile = gcov_read_profile_dir (d1, 0);
+  if (!d1_profile)
+return 1;
+
+  if (d2)
+{
+  d2_profile = gcov_read_profile_dir (d2, 0);
+  if (!d2_profile)
+return 1;
+
+  return gcov_profile_overlap (d1_profile, d2_profile);
+}
+
+  return 1;
+}
+
+/* Usage message for profile overlap.  */
+
+static void
+print_overlap_usage_message (int error_p)
+{
+  FILE *file = error_p ? stderr : stdout;
+
+  fnotice (file,   overlap [options] dir1 dir2   Compute the overlap 
of two profiles\n);
+  fnotice (file, -v, --verbose   Verbose mode\n);
+  fnotice (file, -h, --hotonly   Only print info for 
hot objects/functions\n);
+  fnotice (file, -f, --function  Print function level 
info\n);
+  fnotice (file, -F, --fullname  Print full 
filename\n);
+  fnotice (file, -o, --objectPrint object level 
info\n);
+  fnotice (file, -t float, --hot_threshold float Set the threshold 
for hotness\n);
+
+}
+
+static const struct option overlap_options[] =
+{
+  { verbose,no_argument,   NULL, 'v' },
+  { function,   no_argument,   NULL, 'f' },
+  { fullname,   no_argument,   NULL, 'F' },
+  { object, no_argument,   NULL, 'o' },
+  { hotonly,no_argument,   NULL, 'h' },
+  { hot_threshold,  required_argument, NULL, 't' },
+  { 0, 0, 0, 0 }
+};
+
+/* Print overlap usage and exit.  */
+
+static void
+overlap_usage (void)
+{
+  fnotice (stderr, Overlap subcomand usage:);
+  print_overlap_usage_message (true);
+  exit (FATAL_EXIT_CODE);
+}
+
+int overlap_func_level;
+int overlap_obj_level;
+int overlap_hot_only;
+int overlap_use_fullname;
+double overlap_hot_threshold = 0.005;
+
+/* Driver for profile overlap sub-command.  */
+
+static int
+do_overlap (int argc, char **argv)
+{
+  int opt;
+  int ret;
+
+  optind = 0;
+  while ((opt = getopt_long (argc, argv, vfFoht:, overlap_options, NULL)) != 
-1)
+{
+  switch (opt)
+{
+case 'v':
+  verbose = true;
+  gcov_set_verbose ();
+  break;
+case 'f':
+  overlap_func_level = 1;
+  break;
+case 'F':
+  overlap_use_fullname = 1;
+  break;
+case 'o

[PATCH] Indirect-call topn targets profiler (instrumentation)

2014-10-06 Thread Rong Xu
Hi,

I ported the indirect-call topn target profile from google branch. It
implements the algorithm described in Value profiling and
optimization by Brad Calder and Peter Feller. This patch is about the
instrumentation. When --param=indir-call-topn-profile=1 is specified,
we will use indirection-call-topn profiler in this patch, instead of
indirect_call_profiler_v2.

Tested with gcc regression tests and the runtime patch (in a separated patch).

Thanks,

-Rong


patch2_diff
Description: Binary data


[PATCH] Indirect-call topn targets profiler (runtime)

2014-10-06 Thread Rong Xu
Hi,

I ported the indirect-call topn target profile from google branch. It
implements the algorithm described in Value profiling and
optimization by Brad Calder and Peter Feller. This patch is the
runtime support.

Tested with gcc regression tests and the instrumentation patch (in a
separated patch).

Thanks,

-Rong


patch1_diff
Description: Binary data


Re: [PATCH] Indirect-call topn targets profiler (instrumentation)

2014-10-06 Thread Rong Xu
On Mon, Oct 6, 2014 at 12:21 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On Mon, Oct 6, 2014 at 11:22 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Rong,
  Would be possible to use topn profiler to get resonale histograms for 
  switch
  expansion, too?  In that case it may make sense to have value version of 
  it as well.


Yes. That's doable. I'm just not sure if top 2 entries are enough. I
can see from your example that it will be useful for BBs having
multiple case stmt.

 The underlying topn_profiler can be used the same way as the current
 one_value profiler for different scenarios. However, I wonder if it is
 useful for switch value profiling -- the arc profiling can do the
 same.

 Well, for switch expansion one really wants a histogram of values across the 
 switch
 values for the decision tree balancing.  But I guess this profiler does not 
 fit
 that very well.

 Consider cases like

 switch (a)
 {
case 3:
case 5:
case 7:
 prime();
case 2:
case 4:
case 6:
 even();
   default:
 bogus ();
 }

 Arc profiling does tell you how often a is prime or even, but it does not 
 help you
 to balance the tree, because the value ranges are iinterlacing

 David

 
  Otherwise the patch seems OK.  I would implement it myself by introducing 
  separate
  variables holding the topn profiler declaration, but can not think of 
  downsides of
  doing it your way.
 
  The topn profiler should be better on determining the dominating target, 
  so perhaps
  you could look into hooking it into ipa-profile.c.  Extending speculative 
  edges for
  multiple targets ought to be also quite easy - we need to update 
  speculative_call_info
  to collect edges annotated to a given call stmt into a vector instead of 
  taking the
  two references it does now.
 
  It would be also very nice to update it to handle case where all the edges 
  are direct
  so we can use it tospeculatively inlinine functions that can be interposed 
  at runtime.

Yes. I'm looking at this too. I actually ported the fdo-use part of
the patch and it passes spec2006. But it performs the transformation
using the old way (in gimple_ic_transform()) -- so I did not send out
that patch.
I'm trying to do this in the new way (in ipa-profile()). In any case,
that will be a separated patch.

Also, the switch expansion profile should be also a separated patch.

Is it ok to commit these two patches now?

Thanks,

-Rong

 
  Thanks,
  Honza
 
  Index: gcc/tree-profile.c
  ===
  --- gcc/tree-profile.c  (revision 215886)
  +++ gcc/tree-profile.c  (working copy)
  @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
   #include target.h
   #include tree-cfgcleanup.h
   #include tree-nested.h
  +#include params.h
 
   static GTY(()) tree gcov_type_node;
   static GTY(()) tree tree_interval_profiler_fn;
  @@ -101,7 +102,10 @@ init_ic_make_global_vars (void)
   {
 ic_void_ptr_var
  = build_decl (UNKNOWN_LOCATION, VAR_DECL,
  - get_identifier (__gcov_indirect_call_callee),
  + get_identifier (
  + (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) 
  ?
  +  __gcov_indirect_call_topn_callee :
  +  __gcov_indirect_call_callee)),
ptr_void);
 TREE_PUBLIC (ic_void_ptr_var) = 1;
 DECL_EXTERNAL (ic_void_ptr_var) = 1;
  @@ -131,7 +135,10 @@ init_ic_make_global_vars (void)
   {
 ic_gcov_type_ptr_var
  = build_decl (UNKNOWN_LOCATION, VAR_DECL,
  - get_identifier (__gcov_indirect_call_counters),
  + get_identifier (
  + (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) 
  ?
  +  __gcov_indirect_call_topn_counters :
  +  __gcov_indirect_call_counters)),
gcov_type_ptr);
 TREE_PUBLIC (ic_gcov_type_ptr_var) = 1;
 DECL_EXTERNAL (ic_gcov_type_ptr_var) = 1;
  @@ -226,8 +233,10 @@ gimple_init_edge_profiler (void)
ptr_void,
NULL_TREE);
tree_indirect_call_profiler_fn
  - = build_fn_decl (__gcov_indirect_call_profiler_v2,
  -ic_profiler_fn_type);
  + = build_fn_decl ( (PARAM_VALUE 
  (PARAM_INDIR_CALL_TOPN_PROFILE) ?
  +__gcov_indirect_call_topn_profiler:
  +__gcov_indirect_call_profiler_v2),
  +  ic_profiler_fn_type);
   }
 TREE_NOTHROW (tree_indirect_call_profiler_fn) = 1;
 DECL_ATTRIBUTES (tree_indirect_call_profiler_fn)
  @@ -398,6 +407,12 @@ gimple_gen_ic_profiler (histogram_value value, uns
 gimple_stmt_iterator gsi 

[google gcc-4_9] fix undefined references in debug_info

2014-10-03 Thread Rong Xu
Hi,

This patch fixed a bug exposed in build kernel with fdo.

We cannot simply overwrite the bb footer in  emit_barrier_after_bb as
the bb may already have a footer (in this case, a deleted label stmt).
We need to output this label because it's a user label and debug_info
has a reference to it.

Tested with problematic file and regression test.
Trunk may also have the same issue, but I need to work on a testcase.

Thanks,

-Rong
2014-10-02  Rong Xu  x...@google.com

* gcc/cfgrtl.c (emit_barrier_after_bb): Append footer instead of
overwriting.

Index: gcc/cfgrtl.c
===
--- gcc/cfgrtl.c(revision 215823)
+++ gcc/cfgrtl.c(working copy)
@@ -1453,7 +1453,20 @@ emit_barrier_after_bb (basic_block bb)
   gcc_assert (current_ir_type () == IR_RTL_CFGRTL
   || current_ir_type () == IR_RTL_CFGLAYOUT);
   if (current_ir_type () == IR_RTL_CFGLAYOUT)
-BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
+{
+  rtx insn = unlink_insn_chain (barrier, barrier);
+
+  if (BB_FOOTER (bb))
+   {
+  rtx footer_tail = BB_FOOTER (bb);
+  while (NEXT_INSN(footer_tail))
+footer_tail = NEXT_INSN (insn);
+ NEXT_INSN (footer_tail) = insn;
+ PREV_INSN (insn) = footer_tail;
+   }
+  else
+BB_FOOTER (bb) = insn;
+}
 }
 
 /* Like force_nonfallthru below, but additionally performs redirection


[PATCH] PR61889

2014-09-02 Thread Rong Xu
This patch makes the build of gcov-tool configurable. It checks if
ftw.h is available. For mingw build, it provides ftw functionality by
using FindFirstFile/FindNextFile/FindClose API.

Tested with and without --disable-gcov-tool.

Thanks,

-Rong
2014-09-02  Rong Xu  x...@google.com

* gcc/Makefile.in: Make the build gcov-tool configurable.
* gcc/configure.ac: Ditto.
* gcc/configure: Ditto.
* gcc/config.in: Ditto.
* gcc/gcov-tool.c (unlink_gcda_file): Support win32 build.
(unlink_profile_dir): Ditto.
* libgcc/libgcov-util.c (read_gcda_file): Ditto.
(read_file_handler): Ditto.
(ftw_read_file): Ditto.
(myftw): Ditto.
(gcov_read_profile_dir): Ditto.
(gcov_profile_normalize): Ditto.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 214831)
+++ gcc/Makefile.in (working copy)
@@ -123,9 +123,13 @@ SUBDIRS =@subdirs@ build
 
 # Selection of languages to be made.
 CONFIG_LANGUAGES = @all_selected_languages@
-LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \
-$(CONFIG_LANGUAGES)
+LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
 
+disable_gcov_tool = @disable_gcov_tool@
+ifneq ($(disable_gcov_tool),yes)
+LANGUAGES += gcov-tool$(exeext)
+endif
+
 # Default values for variables overridden in Makefile fragments.
 # CFLAGS is for the user to override to, e.g., do a cross build with -O2.
 # TCFLAGS is used for compilations with the GCC just built.
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 214831)
+++ gcc/configure.ac(working copy)
@@ -5650,6 +5650,26 @@ if test ${ENABLE_LIBQUADMATH_SUPPORT} != no ;
 fi
 
 
+# Check if gcov-tool can be built.
+AC_ARG_ENABLE(gcov-tool,
+[AS_HELP_STRING([--disable-gcov-tool],
+[disable the build of gcov-tool])])
+if test x$enable_gcov_tool = xno; then
+  disable_gcov_tool=yes
+else
+  AC_CHECK_HEADERS(ftw.h, [disable_gcov_tool=no],
+   [case $host_os in
+  win32 | cygwin* | mingw32*)
+disable_gcov_tool=no
+;;
+  *)
+disable_gcov_tool=yes
+;;
+esac])
+fi
+AC_SUBST(disable_gcov_tool)
+
+
 # Specify what hash style to use by default.
 AC_ARG_WITH([linker-hash-style],
 [AC_HELP_STRING([--with-linker-hash-style={sysv,gnu,both}],
Index: gcc/configure
===
--- gcc/configure   (revision 214831)
+++ gcc/configure   (working copy)
@@ -600,6 +600,7 @@ ac_includes_default=\
 
 ac_subst_vars='LTLIBOBJS
 LIBOBJS
+disable_gcov_tool
 PICFLAG
 enable_host_shared
 enable_plugin
@@ -932,6 +933,7 @@ enable_version_specific_runtime_libs
 enable_plugin
 enable_host_shared
 enable_libquadmath_support
+enable_gcov_tool
 with_linker_hash_style
 '
   ac_precious_vars='build_alias
@@ -1655,6 +1657,7 @@ Optional Features:
   --enable-host-sharedbuild host code as shared libraries
   --disable-libquadmath-support
   disable libquadmath support for Fortran
+  --disable-gcov-tool disable the build of gcov-tool
 
 Optional Packages:
   --with-PACKAGE[=ARG]use PACKAGE [ARG=yes]
@@ -8353,7 +8356,7 @@ fi
 for ac_header in limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \
 fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \
 sys/resource.h sys/param.h sys/times.h sys/stat.h \
-direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h
+direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h ftw.h
 do :
   as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh`
 ac_fn_c_check_header_preproc $LINENO $ac_header $as_ac_Header
@@ -18033,7 +18036,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat  conftest.$ac_ext _LT_EOF
-#line 18036 configure
+#line 18039 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
@@ -18139,7 +18142,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat  conftest.$ac_ext _LT_EOF
-#line 18142 configure
+#line 18145 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
@@ -28116,6 +28119,33 @@ $as_echo #define ENABLE_LIBQUADMATH_SUPPORT 1 
 fi
 
 
+# Check if gcov-tool can be built.
+# Check whether --enable-gcov-tool was given.
+if test ${enable_gcov_tool+set} = set; then :
+  enableval=$enable_gcov_tool;
+fi
+
+if test x$enable_gcov_tool = xno; then
+  disable_gcov_tool=yes
+else
+  ac_fn_c_check_header_preproc $LINENO ftw.h ac_cv_header_ftw_h
+if test x$ac_cv_header_ftw_h = xyes; then :
+  disable_gcov_tool=no
+else
+  case $host_os in
+  win32 | cygwin* | mingw32*)
+disable_gcov_tool

[patch] make gcov-tool build configurable and fix mingw build issue

2014-08-13 Thread Rong Xu
Hi,

This patch makes gcov-tool build configurable. A new configure option
--disable-gcov-tool is added to disable the build.

It also fixes some build issues for mingw.

Tested with regression and bootstrap. mingw test is ongoing.

OK to check in if mingw build passes?

Thanks,

-Rong


gcov_tool_patch_diff
Description: Binary data


Re: [BUILDROBOT] gcov patch

2014-07-11 Thread Rong Xu
Sorry. This code meant to work with the different mkdir api in
windows. I used wrong ifdef.

Here is the patch. OK for checkin?

Thanks,

-Rong

2014-07-11  Rong Xu  x...@google.com

* gcov-tool.c (gcov_output_files): Fix build error.

Index: gcov-tool.c
===
--- gcov-tool.c (revision 212448)
+++ gcov-tool.c (working copy)
@@ -90,8 +90,8 @@ gcov_output_files (const char *out, struct gcov_in
   /* Try to make directory if it doesn't already exist.  */
   if (access (out, F_OK) == -1)
 {
-#ifdef TARGET_POSIX_IO
-  if (mkdir (out, 0755) == -1  errno != EEXIST)
+#if !defined(_WIN32)
+  if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1  errno != EEXIST)
 #else
   if (mkdir (out) == -1  errno != EEXIST)
 #endif

On Fri, Jul 11, 2014 at 6:08 AM, Jan-Benedict Glaw jbg...@lug-owl.de wrote:
 On Fri, 2014-07-11 15:03:06 +0200, Jan-Benedict Glaw jbg...@lug-owl.de 
 wrote:
 See eg. 
 http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=289639,
 it breaks like this:

 [...]
 g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
 -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
 -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long 
 -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H 
 -I. -I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
 -I/home/jbglaw/repos/gcc/gcc/../include 
 -I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
 -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
 -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
 -I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o gcov-tool.o -MT 
 gcov-tool.o -MMD -MP -MF ./.deps/gcov-tool.TPo 
 /home/jbglaw/repos/gcc/gcc/gcov-tool.c
 /usr/include/sys/stat.h: In function ‘void gcov_output_files(const char*, 
 gcov_info*)’:
 /usr/include/sys/stat.h:323: error: too few arguments to function ‘int 
 mkdir(const char*, __mode_t)’
 /home/jbglaw/repos/gcc/gcc/gcov-tool.c:96: error: at this point in file
 make[1]: *** [gcov-tool.o] Error 1
 make[1]: Leaving directory `/home/jbglaw/build/rl78-elf/build-gcc/gcc'
 make: *** [all-gcc] Error 2

 [...]

 +#ifdef TARGET_POSIX_IO
 +  if (mkdir (out, 0755) == -1  errno != EEXIST)
 +#else
 +  if (mkdir (out) == -1  errno != EEXIST)
 +#endif


 And with POSIX I/O, this should probably use the S_I* macros for the
 mode bits?

 MfG, JBG

 --
   Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
  Signature of:Arroganz verkürzt fruchtlose Gespräche.
  the second  :   -- Jan-Benedict Glaw


Re: [BUILDROBOT] gcov patch

2014-07-11 Thread Rong Xu
On Fri, Jul 11, 2014 at 8:06 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Sorry. This code meant to work with the different mkdir api in
 windows. I used wrong ifdef.

 Here is the patch. OK for checkin?
 OK. I also see the following with LTO bootstrap:
 ../../gcc/../libgcc/libgcov-util.c:41:24: error: type of �gcov_max_filename� 
 does not match original declaration [-Werror]
  extern gcov_unsigned_t gcov_max_filename;
 ^
 ../../gcc/../libgcc/libgcov-driver.c:88:8: note: previously declared here
  size_t gcov_max_filename = 0;
 Probably both can be size_t?

OK. I will change this too and submit.

Thanks for the quick review.

-Rong


 Honza


 Thanks,

 -Rong

 2014-07-11  Rong Xu  x...@google.com

 * gcov-tool.c (gcov_output_files): Fix build error.

 Index: gcov-tool.c
 ===
 --- gcov-tool.c (revision 212448)
 +++ gcov-tool.c (working copy)
 @@ -90,8 +90,8 @@ gcov_output_files (const char *out, struct gcov_in
/* Try to make directory if it doesn't already exist.  */
if (access (out, F_OK) == -1)
  {
 -#ifdef TARGET_POSIX_IO
 -  if (mkdir (out, 0755) == -1  errno != EEXIST)
 +#if !defined(_WIN32)
 +  if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1  errno != EEXIST)

 Sounds almost like something we could have libiberty glue for...


  #else
if (mkdir (out) == -1  errno != EEXIST)
  #endif



Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
I should use _WIN32 macro. This code is for windows mkdir api.
I'll commit a patch for this shortly (approved by honza).

-Rong

On Fri, Jul 11, 2014 at 8:49 AM, Xinliang David Li davi...@google.com wrote:
 What is the macro to test POSIX IO on host? The guard uses
 TARGET_POSIX_IO which is not right.

 David

 On Fri, Jul 11, 2014 at 1:12 AM, Christophe Lyon
 christophe.l...@linaro.org wrote:
 On 11 July 2014 10:07, Richard Biener richard.guent...@gmail.com wrote:
 On Mon, May 5, 2014 at 7:17 PM, Rong Xu x...@google.com wrote:
 Here is the updated patch. The difference with patch set 3 is
 (1) use the gcov-counter.def for scaling operation.
 (2) fix wrong scaling function for time-profiler.

 passed with bootstrap, profiledboostrap and SPEC2006.

 One of the patches breaks bootstrap for me:

 /space/rguenther/src/svn/trunk/gcc/../libgcc/libgcov.h:184: error: ISO
 C++ forbids zero-size array ‘ctrs’
 make[3]: *** [libgcov-util.o] Error 1

 host compiler is gcc 4.3.4

 Richard.


 On my side, commit 212448 breaks the cross-build of GCC for targets
 using newlib:
 * arm-none-eabi
 * aarch64-none-elf
 /usr/include/sys/stat.h: In function 8098void
 gcov_output_files(const char*, gcov_info*)8099:
 /usr/include/sys/stat.h:317: error: too few arguments to function
 8098int mkdir(const char*, __mode_t)8099
 /tmp/1392958_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/gcov-tool.c:96:
 error: at this point in file
 make[2]: *** [gcov-tool.o] Error 1

 Christophe.

 Thanks,

 -Rong

 On Thu, May 1, 2014 at 3:37 PM, Rong Xu x...@google.com wrote:
 Hi, Honza,

 Please find the new patch in the attachment. This patch integrated
 your earlier suggestions. The noticeable changes are:
 (1) build specialized object for libgcov-driver.c and libgcov-merge.c
 and link into gcov-tool, rather including the source file.
 (2) split some gcov counter definition code to gcov-counter.def to
 avoid code duplication.
 (3) use a macro for gcov_read_counter(), so gcov-tool can use existing
 code in libgcov-merge.c with minimal change.

 This patch does not address the suggestion of creating a new directory
 for libgcov. I agree with you that's a much better
 and cleaner structure we should go for. We can do that in follow-up 
 patches.

 I tested this patch with boostrap and profiledbootstrap. Other tests
 are on-going.

 Thanks,

 -Rong

 On Wed, Apr 16, 2014 at 8:34 PM, Jan Hubicka hubi...@ucw.cz wrote:
 GCOT_TOOL needs to use this function to read the string in gcda file
 to memory to construct gcov_info objects.
 As you noticed, gcov runtime does not need this interface. But
 gcov-tool links with gcov runtime and it also uses the function.
 We could make it available in gcov_runtime, but that will slightly
 increase the memory footprint.

 Yep, it is not really pretty. I wrote bellow some plan how things may be
 structured in more convenient way.  Any work in that direction would be 
 welcome.

 The planned new functions for trunk version is: (1) overlap score b/w
 two profiles (2) better dumping (top hot objects/function/counters)
 and statistics.
 Once this basic version is in, we can start to add the new 
 functionality.

 Sounds good. I assume the autoFDO does not go via gcov tool but rather 
 uses
 custom reader of profile data at GCC side?
 I wonder, are there any recent overview papers/slides/text of design of 
 AutoFDO
 and other features to be merged?
 I remember the talk from GCC Summit and I did read some of pre-LTO LIPO
 sources  papers, but it would be nice to have somethin up to date.

 libgcov-util.o is built in gcc/ directory, rather in libgcc.
 It's directly linked to gcov-tool.
 So libgcov-util.o is built for HOST, not TARGET.
 With makefile changes, we can built HOST version of libgcov-driver.o
 and libgcov-merge.o.
 I also need to make some static functions/variables public.

 I suppose that can go with IN_GCOV_TOOL ifdef.

 So we currently have basic gcov io handling in gcc/gcov-io.c that is 
 borrowed
 by libgcc/libgcov* code.  We also will get libgcov-util.c in libgcc 
 directory
 that is actually borrowed by by gcc/gcov-tool.c code.

 We now have one runtime using STDIO for streaming and kernel has custom 
 version
 that goes via /proc interface (last time I looked).  We added some 
 abstraction
 into libgcov-interface that is the part that relies on STDIO, partly via 
 gcov-io.c
 code and now we have in-memory handling code in libgcov-util.

 I guess it would make most sentse to put all the gcov code into a new 
 directory
 (libgcov) and make it stand-alone library that can be configured
 1) for stdio based runtime as we do not
 2) for runtime missing the interface and relyin on user providing it
 3) for use within gcov file manipulation tools with reorg of
 GCC/gcov/gcov-dump/gcov-tool to all use the same low-level interfaces.
 In a longer term, I would like to make FDO profiling intstrumentation to 
 happen
 at linktime. For that I need to make the instrumentation code (minimal 
 spaning
 tree  friends

Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
I did see the warning in the bootstrap, but it did not exit the build.
I thought it was ok.

I'll have a patch for this and send for review.

-Rong

On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li davi...@google.com wrote:
 right.

 Rong, the fix would be just change ctr array size to 1. For each
 function, there should be at least one kind of counters -- see the
 assertion in build_fn_info_type.  There are some code that do
 'sizeof(gcov_fn_info)' when computing heap size -- they can be
 adjusted or leave it as it is (if not doing memcpy for the whole
 array).

 David

 On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote:
 I wonder why. The struct definition for gcov_fn_info has not changed
 in this patch.

 Perhaps it has been used only in C until now?

 Jakub


Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
Richard,

I looked at my patch again. I already add -Wno-error to libgcov-util.o
compilation:
In line 200 of gcc/Makefile.in
libgcov-util.o-warn = -Wno-error

In my test, I used gcc-4.6 as the host compiler. I got warning like this:

In file included from ../../gcc/gcc/../libgcc/libgcov-util.c:30:0:
../../gcc/gcc/../libgcc/libgcov.h:184:30: warning: ISO C++ forbids
zero-size array ‘ctrs’ [-pedantic]

Can you check your buildlog to see if -Wno-error is added to the command line?

Thanks,

-Rong

On Fri, Jul 11, 2014 at 9:47 AM, Rong Xu x...@google.com wrote:
 I did see the warning in the bootstrap, but it did not exit the build.
 I thought it was ok.

 I'll have a patch for this and send for review.

 -Rong

 On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li davi...@google.com wrote:
 right.

 Rong, the fix would be just change ctr array size to 1. For each
 function, there should be at least one kind of counters -- see the
 assertion in build_fn_info_type.  There are some code that do
 'sizeof(gcov_fn_info)' when computing heap size -- they can be
 adjusted or leave it as it is (if not doing memcpy for the whole
 array).

 David

 On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote:
 I wonder why. The struct definition for gcov_fn_info has not changed
 in this patch.

 Perhaps it has been used only in C until now?

 Jakub


Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
On Fri, Jul 11, 2014 at 11:46 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Richard,

 I looked at my patch again. I already add -Wno-error to libgcov-util.o
 compilation:
 In line 200 of gcc/Makefile.in
 libgcov-util.o-warn = -Wno-error

 In my test, I used gcc-4.6 as the host compiler. I got warning like this:

 In file included from ../../gcc/gcc/../libgcc/libgcov-util.c:30:0:
 ../../gcc/gcc/../libgcc/libgcov.h:184:30: warning: ISO C++ forbids
 zero-size array ???ctrs??? [-pedantic]

 Can you check your buildlog to see if -Wno-error is added to the command 
 line?

 I would preffer the libgcov.h (that is interface header to libgcov users) to 
 be
 valid C++, so we still ought to fix it.

 Honza

OK. I will send out a patch for review.

-Rong


 Thanks,

 -Rong

 On Fri, Jul 11, 2014 at 9:47 AM, Rong Xu x...@google.com wrote:
  I did see the warning in the bootstrap, but it did not exit the build.
  I thought it was ok.
 
  I'll have a patch for this and send for review.
 
  -Rong
 
  On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li davi...@google.com 
  wrote:
  right.
 
  Rong, the fix would be just change ctr array size to 1. For each
  function, there should be at least one kind of counters -- see the
  assertion in build_fn_info_type.  There are some code that do
  'sizeof(gcov_fn_info)' when computing heap size -- they can be
  adjusted or leave it as it is (if not doing memcpy for the whole
  array).
 
  David
 
  On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek ja...@redhat.com wrote:
  On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote:
  I wonder why. The struct definition for gcov_fn_info has not changed
  in this patch.
 
  Perhaps it has been used only in C until now?
 
  Jakub


Re: [Google] Fix AFDO early inline ICEs due to DFE

2014-06-12 Thread Rong Xu
This looks fine to me.

-Rong

On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson tejohn...@google.com wrote:
 These two patches fix multiple ICE that occurred due to DFE being
 recently enabled after AutoFDO LIPO linking.

 Passes regression and internal testing. Ok for Google/4_8?

 Teresa

 2014-06-12  Teresa Johnson  tejohn...@google.com
 Dehao Chen  de...@google.com

 Google ref b/15521327.

 * cgraphclones.c (cgraph_clone_edge): Use resolved node.
 * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node.

 Index: cgraphclones.c
 ===
 --- cgraphclones.c  (revision 211386)
 +++ cgraphclones.c  (working copy)
 @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
  #include ipa-utils.h
  #include lto-streamer.h
  #include except.h
 +#include l-ipo.h

  /* Create clone of E in the node N represented by CALL_EXPR the callgraph.  
 */
  struct cgraph_edge *
 @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c

if (call_stmt  (decl = gimple_call_fndecl (call_stmt)))
 {
 - struct cgraph_node *callee = cgraph_get_node (decl);
 +  struct cgraph_node *callee;
 +  if (L_IPO_COMP_MODE  cgraph_pre_profiling_inlining_done)
 +callee = cgraph_lipo_get_resolved_node (decl);
 +  else
 +callee = cgraph_get_node (decl);
   gcc_checking_assert (callee);
   new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq);
 }
 Index: l-ipo.c
 ===
 --- l-ipo.c (revision 211386)
 +++ l-ipo.c (working copy)
 @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str
gcc_assert (decl1_defined);
add_define_module (*slot, decl2);

 +  /* Pick the node that cannot be removed, to avoid a situation
 + where we remove the resolved node and later try to access
 + it for the remaining non-removable copy.  E.g. one may be
 + extern and the other weak, only the extern copy can be removed.  */
 +  if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)-rep_node)
 +   !cgraph_can_remove_if_no_direct_calls_and_refs_p (node))
 +{
 +  (*slot)-rep_node = node;
 +  (*slot)-rep_decl = decl2;
 +  return;
 +}
 +
has_prof1 = has_profile_info (decl1);
bool is_aux1 = cgraph_is_auxiliary (decl1);
bool is_aux2 = cgraph_is_auxiliary (decl2);


 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [Google/4_8] Reduce memory overhead of LIPO COMDAT fixups

2014-06-05 Thread Rong Xu
This patch looks good to me.

-Rong

On Thu, Jun 5, 2014 at 12:45 PM, Teresa Johnson tejohn...@google.com wrote:
 (cc'ing a few additional people to help with review as David is out
 and I'm not sure Rong is available)

 This patch greatly reduces the memory overhead of the new COMDAT fixup
 analysis, by changing the second level hash tables to linked lists.
 I found that almost none of the second level hash tables contained more
 than one entry, but each hash table required a lot of memory overhead.

 I also now query the fixup type before adding the checksums to the
 pointer sets during callgraph building, which would have enabled a workaround
 for the memory issue.

 Tested with regression tests and internal tests (see ref below for details).

 Google ref b/15415042.

 2014-06-05  Teresa Johnson  tejohn...@google.com

 * dyn-ipa.c (struct lineno_checksum_alias): Replaced pointer set.
 (struct checksum_alias_info): Enabled linked list.
 (cfg_checksum_get_key): Removed.
 (find_cfg_checksum): New function.
 (cfg_checksum_insert): Operate on linked list.
 (checksum_set_insert): Ditto.
 (gcov_build_callgraph): Allow disabling checksum insertion.
 (gcov_find_new_ic_target): Operate on linked list.
 (gcov_fixup_counters_checksum): Ditto.
 (gcov_fixup_counters_lineno): Ditto.
 (__gcov_compute_module_groups): Compute fixup type earlier.

 Index: dyn-ipa.c
 ===
 --- dyn-ipa.c   (revision 211288)
 +++ dyn-ipa.c   (working copy)
 @@ -79,16 +79,15 @@ struct dyn_cgraph
unsigned num_nodes_executed;
/* used by new algorithm  */
struct modu_node *modu_nodes;
 -  /* Set indexed by lineno_checksum, returns another dyn_pointer_set*,
 - indexed by cfg_checksum.  That returns a checksum_alias_info struct.  */
 +  /* Set indexed by lineno_checksum, returns a linked list of
 + checksum_alias_info structs.  */
struct dyn_pointer_set *lineno_pointer_sets;
  };

  /* Struct holding information for functions with the same lineno_checksum.  
 */
  struct lineno_checksum_alias
  {
 -  /* Set indexed by cfg_checksum, holding a checksum_alias_info struct.  */
 -  struct dyn_pointer_set *cfg_pointer_set;
 +  struct checksum_alias_info *cfg_checksum_list;
unsigned lineno_checksum;
  };

 @@ -96,6 +95,7 @@ struct lineno_checksum_alias
 checksums.  */
  struct checksum_alias_info
  {
 +  struct checksum_alias_info *next_cfg_checksum;
struct checksum_alias *alias_list;
unsigned cfg_checksum;
  };
 @@ -205,6 +205,7 @@ pointer_set_create (unsigned (*get_key) (const voi
  static struct dyn_cgraph the_dyn_call_graph;
  static int total_zero_count = 0;
  static int total_insane_count = 0;
 +static int fixup_type = 0;

  enum GROUPING_ALGORITHM
  {
 @@ -374,14 +375,6 @@ lineno_checksum_get_key (const void *p)
return ((const struct lineno_checksum_alias *) p)-lineno_checksum;
  }

 -/* The cfg_checksum value in P is the key for a cfg_pointer_set.  */
 -
 -static inline unsigned
 -cfg_checksum_get_key (const void *p)
 -{
 -  return ((const struct checksum_alias_info *) p)-cfg_checksum;
 -}
 -
  /* Create a new checksum_alias struct for function with GUID, FI_PTR,
 and ZERO_COUNTS flag.  Prepends to list NEXT and returns new struct.  */

 @@ -398,28 +391,44 @@ new_checksum_alias (gcov_type guid, const struct g
return alias;
  }

 -/* Insert a new checksum_alias struct into pointer set P for function with
 +/* Locate the checksum_alias_info in LIST that matches CFG_CHECKSUM.  */
 +
 +static struct checksum_alias_info *
 +find_cfg_checksum (struct checksum_alias_info *list, unsigned cfg_checksum)
 +{
 +  for (; list; list = list-next_cfg_checksum)
 +{
 +  if (list-cfg_checksum == cfg_checksum)
 +return list;
 +}
 +  return NULL;
 +}
 +
 +/* Insert a new checksum_alias struct into LIST for function with
 CFG_CHECKSUM and associated GUID, FI_PTR, and ZERO_COUNTS flag.  */

 -static void
 -cfg_checksum_set_insert (struct dyn_pointer_set *p, unsigned cfg_checksum,
 - gcov_type guid, const struct gcov_fn_info *fi_ptr,
 - int zero_counts)
 +static struct checksum_alias_info *
 +cfg_checksum_insert (unsigned cfg_checksum, gcov_type guid,
 + const struct gcov_fn_info *fi_ptr, int zero_counts,
 + struct checksum_alias_info *list)
  {
 -  struct checksum_alias_info **m = (struct checksum_alias_info **)
 -pointer_set_find_or_insert (p, cfg_checksum);
 -  if (*m)
 +  struct checksum_alias_info *alias_info;
 +  alias_info = find_cfg_checksum (list, cfg_checksum);
 +  if (alias_info)
  {
 -  gcc_assert ((*m)-alias_list);
 -  (*m)-alias_list = new_checksum_alias (guid, fi_ptr, zero_counts,
 - (*m)-alias_list);
 +  gcc_assert (alias_info-alias_list);
 +  alias_info-alias_list = 

Re: [Patch] Avoid gcc_assert in libgcov

2014-05-22 Thread Rong Xu
I think these asserts will be used by gcov-tool. So I prefer to change
them to  gcov_nonruntime_assert(). I'll merge them in my new gcov-tool
patch before submitting (waiting for honaz's ok).

Thanks,

-Rong

On Thu, May 22, 2014 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 On Thu, Jan 16, 2014 at 2:41 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On Wed, Jan 15, 2014 at 8:39 PM, Jan Hubicka hubi...@ucw.cz wrote:
 
  In that case should we call gcov_error when IN_LIBGCOV? One
  possibility would be to simply make gcov_nonruntime_assert be defined
  as if (!EXPR) gcov_error in the IN_LIBGCOV case. But I think what you
  wanted here was to reduce libgcov bloat by removing calls altogether,
  which this wouldn't solve. But if we want to call gcov_error in some
  cases, I think I need to add another macro that will either do
  gcc_assert when !IN_LIBGCOV and if (!EXPR) gcov_error when
  IN_LIBGCOV. Is that what you had in mind?
 
  I think for errors that can be triggered by data corruption, we ought to
  produce resonable error messages in both IN_LIBGCOV or for offline tools.
  Just unwound sequence if(...) gcov_error seems fine to me in this case,
  but we may also have assert like wrapper.
  I see we do not provide gcov_error outside libgcov, we probably ought to 
  map
  it to fatal_error in GCC binary.
 
  thanks,
  Honza

 Ok, here is the new patch. Bootstrapped and tested on
 x86_64-unknown-linux-gnu. Ok for trunk?

 Thanks, Teresa

 2014-01-16  Teresa Johnson  tejohn...@google.com

 * gcov-io.c (gcov_position): Use gcov_nonruntime_assert.
 (gcov_is_error): Remove gcc_assert from IN_LIBGCOV code.
 (gcov_rewrite): Use gcov_nonruntime_assert.
 (gcov_open): Ditto.
 (gcov_write_words): Ditto.
 (gcov_write_length): Ditto.
 (gcov_read_words): Use gcov_nonruntime_assert, and remove
 gcc_assert from IN_LIBGCOV code.
 (gcov_read_summary): Use gcov_error to flag profile corruption.
 (gcov_sync): Use gcov_nonruntime_assert.
 (gcov_seek): Remove gcc_assert from IN_LIBGCOV code.
 (gcov_histo_index): Use gcov_nonruntime_assert.
 (static void gcov_histogram_merge): Ditto.
 (compute_working_sets): Ditto.
 * gcov-io.h (gcov_nonruntime_assert): Define.
 (gcov_error): Define for !IN_LIBGCOV

 OK, thanks!
 Honza

 I just found this old patch sitting in a client! Committed as r210805.

 I also discovered that a couple uses of gcc_assert have snuck into
 libgcc/libgcov* files in the meantime. Looks like this got added
 during some of Rong's refactoring, cc'ing Rong. They are in
 libgcc/libgcov-driver-system.c (allocate_filename_struct) and
 libgcov-merge.c (__gcov_merge_single and __gcov_merge_delta). Should I
 remove those or change to gcov_error?

 Teresa



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[google gcc-4_8] LIPO: check the incompatibility of use and saved command line

2014-05-12 Thread Rong Xu
Hi,

This patch adds an extra check of the incompatibility of use and saved
command line for the primary module. If the check files, we skip the
inclusion of all aux modules.
Maybe we should check the aux module command line with the
use-compilation command line directly. But I keep the old way that
checks b/w saved command lines.

Tested with the program exposed this issue and internal benchmarks.
This check never failed for instrumentation based lipo in the test.

Thanks,

-Rong
2014-05-12  Rong Xu  x...@google.com

* coverage.c (struct string_hasher): Use const char.
(has_incompatible_cg_opts): Add new incompatible options.
(incompatible_cl_arg_strs): Refactor from incompatible_cl_args.
(incompatible_cl_args): Refactor.
(read_counts_file): check incompatibility of use command line and the
saved command line for the primary module.
* auto-profile.c (read_aux_modules): Ditto.

Index: coverage.c
===
--- coverage.c  (revision 209291)
+++ coverage.c  (working copy)
@@ -237,8 +237,8 @@ is_last_module (unsigned mod_id)
 struct string_hasher
 {
   /* hash_table support.  */
-  typedef  char value_type;
-  typedef  char compare_type;
+  typedef const char value_type;
+  typedef char compare_type;
   static inline hashval_t hash (const value_type *);
   static int equal (const value_type *, const compare_type *);
   static void remove (value_type *) {};
@@ -272,6 +272,10 @@ static struct opt_desc force_matching_cg_opts[] =
 { -fexceptions, -fno-exceptions, true },
 { -fsized-delete, -fno-sized-delete, false },
 { -frtti, -fno-rtti, true },
+{ -fPIC, -fno-PIC, false },
+{ -fpic, -fno-pic, false },
+{ -fPIE, -fno-PIE, false },
+{ -fpie, -fno-pie, false },
 { -fstrict-aliasing, -fno-strict-aliasing, true },
 { -fsigned-char, -funsigned-char, true },
 /* { -fsigned-char, -fno-signed-char, true },
@@ -315,28 +319,25 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg
   return false;
 }
 
-/* Returns true if the command-line arguments stored in the given module-infos
-   are incompatible.  */
-bool
-incompatible_cl_args (struct gcov_module_info* mod_info1,
- struct gcov_module_info* mod_info2)
+/* Returns true if the command-line arguments stored ARGS_1 and
+   ARGS_2 are incompatible. N_ARGS_1 and N_ARGS_2 are the number
+   of string in ARGS_1 and ARGS_2, respectively.  */
+
+static bool
+incompatible_cl_arg_strs (const char* const* args_1, const unsigned int 
n_args_1,
+  const char *filename1, const char* const* args_2,
+  const unsigned int n_args_2, const char *filename2)
 {
-  char **warning_opts1 = XNEWVEC (char *, mod_info1-num_cl_args);
-  char **warning_opts2 = XNEWVEC (char *, mod_info2-num_cl_args);
-  char **non_warning_opts1 = XNEWVEC (char *, mod_info1-num_cl_args);
-  char **non_warning_opts2 = XNEWVEC (char *, mod_info2-num_cl_args);
-  char *std_opts1 = NULL, *std_opts2 = NULL;
+  const char **warning_opts1 = XNEWVEC (const char *, n_args_1);
+  const char **warning_opts2 = XNEWVEC (const char *, n_args_2);
+  const char **non_warning_opts1 = XNEWVEC (const char *, n_args_1);
+  const char **non_warning_opts2 = XNEWVEC (const char *, n_args_2);
+  const char *std_opts1 = NULL, *std_opts2 = NULL;
   unsigned int i, num_warning_opts1 = 0, num_warning_opts2 = 0;
   unsigned int num_non_warning_opts1 = 0, num_non_warning_opts2 = 0;
   bool warning_mismatch = false;
   bool non_warning_mismatch = false;
   hash_table string_hasher option_tab1, option_tab2;
-  unsigned int start_index1 = mod_info1-num_quote_paths 
-+ mod_info1-num_bracket_paths + mod_info1-num_system_paths 
-+ mod_info1-num_cpp_defines + mod_info1-num_cpp_includes;
-  unsigned int start_index2 = mod_info2-num_quote_paths
-+ mod_info2-num_bracket_paths + mod_info2-num_system_paths
-+ mod_info2-num_cpp_defines + mod_info2-num_cpp_includes;
 
   bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts, 
has_incompatible_std;
   unsigned int num_cg_opts = 0;
@@ -358,14 +359,13 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg
   option_tab2.create (10);
 
   /* First, separate the warning and non-warning options.  */
-  for (i = 0; i  mod_info1-num_cl_args; i++)
-if (mod_info1-string_array[start_index1 + i][1] == 'W')
-  warning_opts1[num_warning_opts1++] =
-   mod_info1-string_array[start_index1 + i];
+  for (i = 0; i  n_args_1; i++)
+if (args_1[i][1] == 'W')
+  warning_opts1[num_warning_opts1++] = args_1[i];
 else
   {
-char **slot;
-char *option_string = mod_info1-string_array[start_index1 + i];
+const char **slot;
+const char *option_string = args_1[i];
 
 check_cg_opts (cg_opts1, option_string);
if (strstr (option_string, -std=))
@@ -379,14 +379,14 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg

[google gcc-4_8] fix lipo ICE

2014-05-08 Thread Rong Xu
Hi,

This patch fixed lipo ICE triggered by an out-of-bound access.

This is google specific patch and tested with bootstrap and the
program exposed the issue.

Thanks,

-Rong
2014-05-08  Rong Xu  x...@google.com

* tree-inline.c (add_local_variables): Check if the debug_expr
is a decl_node before calling is_global_var.

Index: tree-inline.c
===
--- tree-inline.c   (revision 209291)
+++ tree-inline.c   (working copy)
@@ -3842,7 +3842,7 @@ add_local_variables (struct function *callee, stru
of varpool node does not check the reference
from debug expressions.
Set it to 0 for all global vars.  */
-if (L_IPO_COMP_MODE  tem  is_global_var (tem))
+if (L_IPO_COMP_MODE  tem  DECL_P (tem)  is_global_var (tem))
   tem = NULL;
 
id-remapping_type_depth++;


Re: [PATCH] offline gcda profile processing tool

2014-04-16 Thread Rong Xu
On Tue, Apr 15, 2014 at 2:38 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Rong, David, Dehao, Teresa
 I would like to have some rought idea of what we could merge this stage1. 
 There is
 certainly a lot of interesting stuff on the google branch including AutoFDO, 
 LIPO,
 the multivalue profile counters that may be used by the new devirtualization 
 bits
 and more. I also think we should switch counts into floating point 
 representation
 so Teresa's splitting patch works.

 Can we get plans to make this effective? My personal schedule is quite free 
 until
 April 29 when I go to Czech Republic for wedding and I will be back in Calgary
 at 14th.

 2014-03-03  Rong Xu  x...@google.com

   * gcc/gcov-io.c (gcov_read_string): Make this routine available
 to gcov-tool.
   (gcov_sync): Ditto.
   * gcc/Makefile.in: Build and install gcov-tool.
   * gcc/gcov-tool.c (unlink_gcda_file): Remove one gcda file.
   (unlink_profile_dir): Remove gcda files from the profile path.
   (profile_merge): Merge two profiles in directory.
   (print_merge_usage_message): Print merge usage.
   (merge_usage): Print merge usage and exit.
   (do_merge): Driver for profile merge sub-command.
   (profile_rewrite): Rewrite profile.
   (print_rewrite_usage_message): Print rewrite usage.
   (rewrite_usage): Print rewrite usage and exit.
   (do_rewrite): Driver for profile rewrite sub-command.
   (print_usage): Print gcov-info usage and exit.
   (print_version): Print gcov-info version.
   (process_args): Process arguments.
   (main): Main routine for gcov-tool.
   * libgcc/libgcov.h : Include the set of base-type headers for
 gcov-tool.
 (struct gcov_info): Make the functions field mutable in gcov-tool
 compilation.
   * libgcc/libgcov-merge.c (gcov_get_counter): New wrapper function
 to get the profile counter.
   (gcov_get_counter_target): New wrapper function to get the profile
 values that should not be scaled.
   (__gcov_merge_add): Replace gcov_read_counter() with the wrapper
 functions.
   (__gcov_merge_ior): Ditto.
   (__gcov_merge_time_profile): Ditto.
   (__gcov_merge_single): Ditto.
   (__gcov_merge_delta): Ditto.
   * libgcc/libgcov-util.c (void gcov_set_verbose): Set the verbose flag
 in the utility functions.
   (set_fn_ctrs): Utility function for reading gcda files to in-memory
 gcov_list object link lists.
   (tag_function): Ditto.
   (tag_blocks): Ditto.
   (tag_arcs): Ditto.
   (tag_lines): Ditto.
   (tag_counters): Ditto.
   (tag_summary): Ditto.
   (read_gcda_finalize): Ditto.
   (read_gcda_file): Ditto.
   (ftw_read_file): Ditto.
   (read_profile_dir_init): Ditto.
   (gcov_read_profile_dir): Ditto.
   (gcov_read_counter_mem): Ditto.
   (gcov_get_merge_weight): Ditto.
   (merge_wrapper): A wrapper function that calls merging handler.
   (gcov_merge): Merge two gcov_info objects with weights.
   (find_match_gcov_info): Find the matched gcov_info in the list.
   (gcov_profile_merge): Merge two gcov_info object lists.
   (__gcov_add_counter_op): Process edge profile counter values.
   (__gcov_ior_counter_op): Process IOR profile counter values.
   (__gcov_delta_counter_op): Process delta profile counter values.
   (__gcov_single_counter_op): Process single  profile counter values.
   (fp_scale): Callback function for float-point scaling.
   (int_scale): Callback function for integer fraction scaling.
   (gcov_profile_scale): Scaling profile counters.
   (gcov_profile_normalize): Normalize profile counters.

 Index: gcc/gcov-io.c
 ===
 --- gcc/gcov-io.c (revision 208237)
 +++ gcc/gcov-io.c (working copy)
 @@ -564,7 +564,7 @@ gcov_read_counter (void)
 buffer, or NULL on empty string. You must copy the string before
 calling another gcov function.  */

 -#if !IN_LIBGCOV
 +#if !IN_LIBGCOV || defined (IN_GCOV_TOOL)
  GCOV_LINKAGE const char *
  gcov_read_string (void)
  {
 @@ -641,7 +641,7 @@ gcov_read_summary (struct gcov_summary *summary)
  }
  }

 -#if !IN_LIBGCOV
 +#if !IN_LIBGCOV || defined (IN_GCOV_TOOL)
  /* Reset to a known position.  BASE should have been obtained from
 gcov_position, LENGTH should be a record length.  */

 I am slightly confused here, IN_LIBGCOV IMO means that the gcov-io is going
 to be linked into the gcov runtime as opposed to gcc, gcov, gcov-dump or
 gcov-tool.  Why we define IN_LIBGCOV  IN_GCOV_TOOL?

GCOT_TOOL needs to use this function to read the string in gcda file
to memory to construct gcov_info objects.
As you noticed, gcov runtime does not need this interface. But
gcov-tool links with gcov runtime and it also uses the function.
We could make it available in gcov_runtime, but that will slightly
increase the memory footprint

Re: [google gcc-4_8] unify int and fp scaling in gcov-tool

2014-02-10 Thread Rong Xu
Yes. They were the bug I mentioned. This patch fixed that.

-Rong

On Mon, Feb 10, 2014 at 10:40 AM, Teresa Johnson tejohn...@google.com wrote:
 Looks good to me.

 A couple questions:

  static void
 -__gcov_scale_icall_topn (gcov_type *counters, unsigned n_counters, double f)
 +__gcov_icall_topn_op (gcov_type *counters, unsigned n_counters,
 +  counter_op_fn fn, void *data1, void *data2)
  {
unsigned i;

 @@ -950,41 +949,65 @@ static void
gcov_type *value_array = counters[i + 1];

for (j = 0; j  GCOV_ICALL_TOPN_NCOUNTS - 1; j += 2)
 -value_array[1] *= f;

 Was the above a bug (always updating index 1)?

 +value_array[j + 1] = fn (value_array[j + 1], data1, data2);
  }
  }

  static void
 -__gcov_scale_dc (gcov_type *counters, unsigned n_counters, double f)
 +__gcov_dc_op (gcov_type *counters, unsigned n_counters,
 +  counter_op_fn fn, void *data1, void *data2)
  {
unsigned i;

gcc_assert (!(n_counters % 2));
for (i = 0; i  n_counters; i += 2)
 -counters[1] *= f;

 Same question here

 +counters[i + 1] = fn (counters[i + 1], data1, data2);
  }

 Teresa

 On Tue, Feb 4, 2014 at 3:54 PM, Rong Xu x...@google.com wrote:
 Hi,

 The attached patch uses a callback function to unify the integer and
 floating-point scaling in gcov-tool. (Also fix a bug in fp scaling of
 ic and dc counters in earlier code).

 Tested with spec2006 profiles.

 OK for checking in?

 Thanks,

 -Rong



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [google gcc-4_8] gcov-tool: some new LIPO supports.

2014-02-04 Thread Rong Xu
Here is the revised patch that integrates Teresa' comments.
Ok for checking in?

-Rong

On Thu, Jan 30, 2014 at 1:20 PM, Rong Xu x...@google.com wrote:
 Comments are inlined. New patch attached to this email.

 -Rong

 On Thu, Jan 30, 2014 at 12:39 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Wed, Jan 29, 2014 at 4:24 PM, Rong Xu x...@google.com wrote:
 Hi,

 The attached patch adds some new features to gcov-tool. It aims to
 rewrite LIPO profile for reuse, debug or performance tuning purpose.

 -l, --modu_list file  Only use the modules in this file

 I think in verbose mode, the tool should emit warnings when a module
 is trimmed due to this option. This can be used in regression test.


 In previous patch, this warning is emitted unconventionally (not
 guarded by verbose flag).
 I changed it to under verbose mode in this patch.

 -r, --string Replace string in path

 --path_substr_replace or something in that line.


 Done.

 -u, --use_imports_file Use the grouping in import files.


 Is there a path in code to auto test this?

 As we discussed offline. This can be verified by a separated script.


 -l uses only the modules in specified file to compute module grouping
 (and subsequent dumping).
 -r replaces the pattern specified in the argument. The format is:
 old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is
 replaced.
 -u skips the run-time module grouping computation and reuses the one
 comes with the profiles (which is user editable).

 Tested with profiles from google internal benchmarks.


 Also use strcasestr for case insenstive operation.

 Done.


 David

 -Rong
Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207435)
+++ gcc/gcov-tool.c (working copy)
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include coretypes.h
 #include tm.h
 #include intl.h
+#include hashtab.h
 #include diagnostic.h
 #include version.h
 #include gcov-io.h
@@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include ftw.h
 #include getopt.h
 #include params.h
+#include string.h
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
@@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
 extern void gcov_set_verbose (void);
+extern void set_use_existing_grouping (void);
+extern void set_use_modu_list (void);
+extern void lipo_set_substitute_string (const char *);
 
-static int verbose;
-
+/* The following defines are needed by dyn-ipa.c.  */
 gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 gcov_unsigned_t __gcov_lipo_weak_inclusion;
@@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed;
 gcov_unsigned_t __gcov_lipo_dump_cgraph;
 gcov_unsigned_t __gcov_lipo_propagate_scale;
 
+static int verbose;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
@@ -285,6 +291,189 @@ profile_rewrite (const char *d1, const char *out,
   return 0;
 }
 
+/* This is the hashtab entry to store a name and mod_id pair. */
+typedef struct {
+  const char *name;
+  unsigned id;
+} mod_name_id;
+
+/* Hash and comparison functions for strings.  */
+
+static unsigned
+mod_name_id_htab_hash (const void *s_p)
+{
+  const char *s = ((const mod_name_id *) s_p)-name;
+  return (*htab_hash_string) (s);
+}
+
+static int
+mod_name_id_hash_eq (const void *s1_p, const void *s2_p)
+{
+  return strcmp (((const mod_name_id *) s1_p)-name,
+ ((const mod_name_id *) s2_p)-name) == 0;
+}
+
+static htab_t mod_name_id_hash_table;
+
+/* Look up an entry in the hash table. STRING is the module name.
+   CREATE controls to insert to htab or not.
+   If (*ID_P != 0), we write (*ID_P) to htab.
+   If (*ID_P == 0), we write module_id to (*ID_P).
+   return 1 if an entry is found and otherwise 0.  */
+
+static int
+module_name_hash_lookup (const char *string, unsigned *id_p, int create)
+{
+  void **e;
+  mod_name_id t;
+
+  t.name = string;
+  e = htab_find_slot (mod_name_id_hash_table, t,
+  create ? INSERT : NO_INSERT);
+  if (e == NULL)
+return 0;
+  if (*e == NULL)
+{
+  *e = XNEW (mod_name_id *);
+  (*(mod_name_id **)e)-name = xstrdup (string);
+}
+  if (id_p)
+{
+  if (*id_p != 0)
+(*(mod_name_id **)e)-id = *id_p;
+  else
+*id_p = (*(mod_name_id **)e)-id;
+}
+  return 1;
+}
+
+/* Return 1 if NAME is of a source type that LIPO targets.
+   Return 0 otherwise.  */
+
+static int
+is_lipo_source_type (char *name)
+{
+  char *p;
+
+  if (strcasestr (name, .c) ||
+  strcasestr (name, .cc) ||
+  strcasestr (name, .cpp) ||
+  strcasestr (name, .c++))
+return 1;
+
+  /* Replace .proto with .pb.cc. Since the two strings have

[google gcc-4_8] fix profiledbootstrap

2014-02-04 Thread Rong Xu
Hi,

The attached patch fixes the duplicated definition error in
gcov-tool.c in profiledbootstrap.

Google branch only and tested with profiledbootstrap.

Ok for checking in?

-Rong
Index: gcov-tool.c
===
--- gcov-tool.c (revision 207435)
+++ gcov-tool.c (working copy)
@@ -50,16 +50,23 @@
 
 static int verbose;
 
-gcov_unsigned_t __gcov_lipo_grouping_algorithm;
-gcov_unsigned_t __gcov_lipo_merge_modu_edges;
-gcov_unsigned_t __gcov_lipo_weak_inclusion;
-gcov_unsigned_t __gcov_lipo_max_mem;
-gcov_unsigned_t __gcov_lipo_random_group_size;
-gcov_unsigned_t __gcov_lipo_cutoff;
-gcov_unsigned_t __gcov_lipo_random_seed;
-gcov_unsigned_t __gcov_lipo_dump_cgraph;
-gcov_unsigned_t __gcov_lipo_propagate_scale;
+/* The following defines are needed by dyn-ipa.c.
+   They will also be emitted by the compiler with -fprofile-generate,
+   which means this file cannot be compiled with -fprofile-generate
+   -- otherwise we get duplicated defintions.
+   Make the defines weak to link with other objects/libraries
+   that potentially compiled with -fprofile-generate.  */
 
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_grouping_algorithm;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_merge_modu_edges;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_weak_inclusion;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_max_mem;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_random_group_size;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_cutoff;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_random_seed;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_dump_cgraph;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_propagate_scale;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
Index: Makefile.in
===
--- Makefile.in (revision 207435)
+++ Makefile.in (working copy)
@@ -4090,6 +4090,9 @@
 GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o gcov-params.o
 gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \
$(CONFIG_H) version.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -c $(ALL_CPPFLAGS) \
+ $(filter-out -fprofile-generate -fprofile-use,$(ALL_COMPILERFLAGS)) \
+ $(INCLUDES) -o $@ $
 gcov-params.o: params.c $(PARAMS_H)
+$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
  $(INCLUDES) -o $@ $


[google gcc-4_8] unify int and fp scaling in gcov-tool

2014-02-04 Thread Rong Xu
Hi,

The attached patch uses a callback function to unify the integer and
floating-point scaling in gcov-tool. (Also fix a bug in fp scaling of
ic and dc counters in earlier code).

Tested with spec2006 profiles.

OK for checking in?

Thanks,

-Rong
2014-02-04  Rong Xu  x...@google.com

* gcc/gcov-tool.c (profile_rewrite2): Remove.
(profile_rewrite): Merge int and fp scaling.
(do_rewrite): Ditto.
* libgcc/libgcov-util.c (typedef gcov_type) New.
(__gcov_add_counter_op): Merge int and fp scaling.
(__gcov_ior_counter_op): Ditto.
(__gcov_delta_counter_op): Ditto.
(__gcov_single_counter_op): Ditto.
(__gcov_icall_topn_op): Ditto.
(__gcov_dc_op): Ditto.
(fp_scale): Ditto.
(int_scale): Ditto.
(gcov_profile_scale): Ditto.
(gcov_profile_normalize): Ditto.
(__gcov_scale2_add): Remove.
(__gcov_scale2_ior): Remove.
(__gcov_scale2_delta): Remove.
(__gcov_scale2_single): Remove.
(__gcov_scale2_icall_topn): Remove.
(__gcov_scale2_dc): Remove.
(gcov_profile_scale2): Remove.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207488)
+++ gcc/gcov-tool.c (working copy)
@@ -43,8 +43,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
-extern int gcov_profile_scale (struct gcov_info*, float);
-extern int gcov_profile_scale2 (struct gcov_info*, int, int);
+extern int gcov_profile_scale (struct gcov_info*, float, int, int);
 extern struct gcov_info* gcov_read_profile_dir (const char*, int);
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
@@ -227,45 +226,13 @@ do_merge (int argc, char **argv)
   return ret;
 }
 
-/* Scale the counters in profile DIR by a factor of N/D.
-   Result is written to dir OUT. Return 0 on success.  */
-
-static int
-profile_rewrite2 (const char *dir, const char *out, int n, int d)
-{
-  char *pwd;
-  int ret;
-  struct gcov_info * profile;
-
-
-  profile = gcov_read_profile_dir (dir, 0);
-  if (!profile)
-return 1;
-
-  /* Output new profile.  */
-  unlink_profile_dir (out);
-  mkdir (out, 0755);
-  pwd = getcwd (NULL, 0);
-  gcc_assert (pwd);
-  ret = chdir (out);
-  gcc_assert (ret == 0);
-
-  gcov_profile_scale2 (profile, n, d);
-
-  set_gcov_list (profile);
-  gcov_exit ();
-
-  ret = chdir (pwd);
-  free (pwd);
-  return 0;
-}
-
 /* If N_VAL is no-zero, normalize the profile by setting the largest counter
counter value to N_VAL and scale others counters proportionally.
Otherwise, multiply the all counters by SCALE.  */
 
 static int
-profile_rewrite (const char *d1, const char *out, long long n_val, float scale)
+profile_rewrite (const char *d1, const char *out, long long n_val,
+ float scale, int n, int d)
 {
   char *pwd;
   int ret;
@@ -287,7 +254,7 @@ static int
   if (n_val)
 gcov_profile_normalize (d1_profile, (gcov_type) n_val);
   else
-gcov_profile_scale (d1_profile, scale);
+gcov_profile_scale (d1_profile, scale, n, d);
 
   set_gcov_list (d1_profile);
   gcov_exit ();
@@ -604,9 +571,9 @@ do_rewrite (int argc, char **argv)
   if (argc - optind == 1)
 {
   if (denominator  0)
-ret = profile_rewrite2 (argv[optind],  output_dir, numerator, 
denominator);
+ret = profile_rewrite (argv[optind],  output_dir, 0, 0.0, numerator, 
denominator);
   else
-ret = profile_rewrite (argv[optind],  output_dir, normalize_val, 
scale);
+ret = profile_rewrite (argv[optind],  output_dir, normalize_val, 
scale, 0, 0);
 }
   else
 rewrite_usage ();
Index: libgcc/libgcov-util.c
===
--- libgcc/libgcov-util.c   (revision 207488)
+++ libgcc/libgcov-util.c   (working copy)
@@ -873,39 +873,38 @@ gcov_profile_merge (struct gcov_info *tgt_profile,
   return 0;
 }
 
-/* This part of code is to scale profile counters.  */
+typedef gcov_type (*counter_op_fn) (gcov_type, void*, void*);
 
-/* Type of function used to normalize counters.  */
-typedef void (*gcov_scale_fn) (gcov_type *, gcov_unsigned_t, double);
+/* Performing FN upon arc counters.  */
 
-/* Scale arc counters. N_COUNTERS of counter value in COUNTERS array are
-   multiplied by a factor F.  */
-
 static void
-__gcov_scale_add (gcov_type *counters, unsigned n_counters, double f)
+__gcov_add_counter_op (gcov_type *counters, unsigned n_counters,
+   counter_op_fn fn, void *data1, void *data2)
 {
   for (; n_counters; counters++, n_counters--)
 {
   gcov_type val = *counters;
-  *counters = val * f;
+  *counters = fn(val, data1, data2);
 }
 }
 
-/* Scale ior counters.  */
+/* Performing FN upon ior counters.  */
 
 static void

Re: [google gcc-4_8] gcov-tools: minor fix for broken build for arm

2014-01-31 Thread Rong Xu
Thanks for catching this, and the fix.
OK for google branch.

-Rong

On Fri, Jan 31, 2014 at 2:46 PM, Hán Shěn (沈涵) shen...@google.com wrote:
 Hi Rong, while building for arm toolchain on chromeos, GCOV_LOCKED is
 not defined, which leads to redefinition of cs_all, this is observed
 on google/gcc-4_8 branch.

 Patch below, tested on chromeos for arm and x86_64 arch.

 Ok for google/gcc-4_8 branch?

 diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
 index 76f6104..bbce3a6 100644
 --- a/libgcc/libgcov-driver.c
 +++ b/libgcc/libgcov-driver.c
 @@ -559,10 +559,6 @@ gcov_exit_merge_summary (const struct gcov_info
 *gi_ptr, struct gcov_summary *pr
  {
struct gcov_ctr_summary *cs_prg, *cs_tprg, *cs_all;
unsigned t_ix;
 -#if !GCOV_LOCKED
 -  /* summary for all instances of program.  */
 -  struct gcov_ctr_summary *cs_all;
 -#endif

/* Merge the summaries.  */
for (t_ix = 0; t_ix  GCOV_COUNTERS_SUMMABLE; t_ix++)


 Han


Re: [google gcc-4_8] gcov-tool: some new LIPO supports.

2014-01-30 Thread Rong Xu
Comments are inlined. New patch attached to this email.

-Rong

On Thu, Jan 30, 2014 at 12:39 PM, Xinliang David Li davi...@google.com wrote:
 On Wed, Jan 29, 2014 at 4:24 PM, Rong Xu x...@google.com wrote:
 Hi,

 The attached patch adds some new features to gcov-tool. It aims to
 rewrite LIPO profile for reuse, debug or performance tuning purpose.

 -l, --modu_list file  Only use the modules in this file

 I think in verbose mode, the tool should emit warnings when a module
 is trimmed due to this option. This can be used in regression test.


In previous patch, this warning is emitted unconventionally (not
guarded by verbose flag).
I changed it to under verbose mode in this patch.

 -r, --string Replace string in path

 --path_substr_replace or something in that line.


Done.

 -u, --use_imports_file Use the grouping in import files.


 Is there a path in code to auto test this?

As we discussed offline. This can be verified by a separated script.


 -l uses only the modules in specified file to compute module grouping
 (and subsequent dumping).
 -r replaces the pattern specified in the argument. The format is:
 old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is
 replaced.
 -u skips the run-time module grouping computation and reuses the one
 comes with the profiles (which is user editable).

 Tested with profiles from google internal benchmarks.


 Also use strcasestr for case insenstive operation.

Done.


 David

 -Rong
Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207316)
+++ gcc/gcov-tool.c (working copy)
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include coretypes.h
 #include tm.h
 #include intl.h
+#include hashtab.h
 #include diagnostic.h
 #include version.h
 #include gcov-io.h
@@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include ftw.h
 #include getopt.h
 #include params.h
+#include string.h
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
@@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
 extern void gcov_set_verbose (void);
+extern void set_use_existing_grouping (void);
+extern void set_use_modu_list (void);
+extern void lipo_set_substitute_string (const char *);
 
-static int verbose;
-
+/* The following defines are needed by dyn-ipa.c.  */
 gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 gcov_unsigned_t __gcov_lipo_weak_inclusion;
@@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed;
 gcov_unsigned_t __gcov_lipo_dump_cgraph;
 gcov_unsigned_t __gcov_lipo_propagate_scale;
 
+static int verbose;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
@@ -285,6 +291,187 @@ profile_rewrite (const char *d1, const char *out,
   return 0;
 }
 
+/* This is the hashtab entry to store a name and mod_id pair. */
+typedef struct {
+  const char *name;
+  unsigned id;
+} mod_name_id;
+
+/* Hash and comparison functions for strings.  */
+
+static unsigned
+mod_name_id_htab_hash (const void *s_p)
+{
+  const char *s = ((const mod_name_id *) s_p)-name;
+  return (*htab_hash_string) (s);
+}
+
+static int
+mod_name_id_hash_eq (const void *s1_p, const void *s2_p)
+{
+  return strcmp (((const mod_name_id *) s1_p)-name,
+ ((const mod_name_id *) s2_p)-name) == 0;
+}
+
+static htab_t mod_name_id_hash_table;
+
+/* Look up an entry in the hash table. STRING is the module name.
+   CREATE controls to insert to htab or not.
+   If (*ID_P != 0), we write (*ID_P) to htab.
+   If (*ID_P == 0), we write module_id to (*ID_P).
+   return 1 if an entry is found and otherwise 0.  */
+
+static int
+module_name_hash_lookup (const char *string, unsigned *id_p, int create)
+{
+  void **e;
+  mod_name_id t;
+
+  t.name = string;
+  e = htab_find_slot (mod_name_id_hash_table, t,
+  create ? INSERT : NO_INSERT);
+  if (e == NULL)
+return 0;
+  if (*e == NULL)
+{
+  *e = XNEW (mod_name_id *);
+  (*(mod_name_id **)e)-name = xstrdup (string);
+}
+  if (id_p)
+{
+  if (*id_p != 0)
+(*(mod_name_id **)e)-id = *id_p;
+  else
+*id_p = (*(mod_name_id **)e)-id;
+}
+  return 1;
+}
+
+/* Return 1 if NAME is of a source type that LIPO targets.
+   Return 0 otherwise.  */
+
+static int
+is_lipo_source_type (char *name)
+{
+  char *p;
+
+  if (strcasestr (name, .c) ||
+  strcasestr (name, .cc) ||
+  strcasestr (name, .cpp) ||
+  strcasestr (name, .c++))
+return 1;
+
+  if ((p = strcasestr (name, .proto)) != NULL)
+{
+  strcpy (p, .pb.cc);
+  return 1;
+}
+
+  return 0;
+}
+
+/* Convert/process the names from dependence query to a
+   stardard

[google gcc-4_8] gcov-tool: some new LIPO supports.

2014-01-29 Thread Rong Xu
Hi,

The attached patch adds some new features to gcov-tool. It aims to
rewrite LIPO profile for reuse, debug or performance tuning purpose.

-l, --modu_list file  Only use the modules in this file
-r, --string Replace string in path
-u, --use_imports_file Use the grouping in import files.

-l uses only the modules in specified file to compute module grouping
(and subsequent dumping).
-r replaces the pattern specified in the argument. The format is:
old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is
replaced.
-u skips the run-time module grouping computation and reuses the one
comes with the profiles (which is user editable).

Tested with profiles from google internal benchmarks.

-Rong
2014-01-29  Rong Xu  x...@google.com

* gcc/gcov-tool.c (mod_name_id): New type of the hashtable entry.
(mod_name_id_htab_hash): Hash function for module-name and module-id
hashtab.
(mod_name_id_hash_eq): Comparison function for the hashtab.
(module_name_hash_lookup): Lookup function of the hashtab.
(is_lipo_source_type): Find if a file is of LIPO target source.
(lipo_process_name_string): Support of -l.
(lipo_process_modu_list): Ditto.
(is_module_available): Support of -u.
(get_module_id_from_name): Ditto.
(print_rewrite_usage_message): Add -u -l and -r support.
(do_rewrite): Ditto.
(print_version): Ditto.
(process_args): Ditto.
* libgcc/dyn-ipa.c (flag_use_existing_grouping): New flag for -u.
(init_dyn_call_graph): Support of -l.
(__gcov_finalize_dyn_callgraph): Ditto.
(gcov_build_callgraph): Ditto.
(gcov_compute_cutoff_count): Ditto.
(gcov_process_cgraph_node): Ditto.
(build_modu_graph): Ditto.
(gcov_compute_module_groups_eager_propagation): Ditto.
(gcov_compute_random_module_groups): Ditto.
(gcov_write_module_infos): Ditto.
(gcov_dump_callgraph): Ditto.
(set_use_existing_grouping): Support of -u.
(open_imports_file): Ditto.
(read_modu_groups_from_imports_files): Ditto.
(__gcov_compute_module_groups): Ditto.
* libgcc/libgcov-util.c (substitute_string): Support -r.
(lipo_set_substitute_string): Ditto.
(lipo_process_substitute_string_1): Ditto.
(lipo_process_substitute_string): Ditto.
(read_gcda_file): Ditto.
(flag_use_modu_list): Support of -l
(set_use_modu_list): Ditto.
(ftw_read_file): Ditto.
(source_profile_dir): Support of -u.
(get_source_profile_dir): Ditto.
(gcov_read_profile_dir): Ditto.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207055)
+++ gcc/gcov-tool.c (working copy)
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include coretypes.h
 #include tm.h
 #include intl.h
+#include hashtab.h
 #include diagnostic.h
 #include version.h
 #include gcov-io.h
@@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include ftw.h
 #include getopt.h
 #include params.h
+#include string.h
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
@@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
 extern void gcov_set_verbose (void);
+extern void set_use_existing_grouping (void);
+extern void set_use_modu_list (void);
+extern void lipo_set_substitute_string (const char *);
 
-static int verbose;
-
+/* The following defines are needed by dyn-ipa.c.  */
 gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 gcov_unsigned_t __gcov_lipo_weak_inclusion;
@@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed;
 gcov_unsigned_t __gcov_lipo_dump_cgraph;
 gcov_unsigned_t __gcov_lipo_propagate_scale;
 
+static int verbose;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
@@ -285,6 +291,189 @@ profile_rewrite (const char *d1, const char *out,
   return 0;
 }
 
+/* This is the hashtab entry to store a name and mod_id pair. */
+typedef struct {
+  const char *name;
+  unsigned id;
+} mod_name_id;
+
+/* Hash and comparison functions for strings.  */
+
+static unsigned
+mod_name_id_htab_hash (const void *s_p)
+{
+  const char *s = ((const mod_name_id *) s_p)-name;
+  return (*htab_hash_string) (s);
+}
+
+static int
+mod_name_id_hash_eq (const void *s1_p, const void *s2_p)
+{
+  return strcmp (((const mod_name_id *) s1_p)-name,
+ ((const mod_name_id *) s2_p)-name) == 0;
+}
+
+static htab_t mod_name_id_hash_table;
+
+/* Look up an entry in the hash table. STRING is the module name.
+   CREATE controls to insert to htab or not.
+   If (*ID_P != 0), we write (*ID_P) to htab

[Google gcc-4_8] always emit __gcov_get_profile_prefix when linking libgcov.a

2014-01-23 Thread Rong Xu
Hi,

This patch is for google/gcc-4_8 branch. It fixes a regression in
earlier libgcov refactoring.

Thanks,

-Rong

2014-01-23  Rong Xu  x...@google.com

* libgcov-driver.c (__gcov_get_profile_prefix): Always emit
this function.

Index: libgcov-driver.c
===
--- libgcov-driver.c(revision 206736)
+++ libgcov-driver.c(working copy)
@@ -68,6 +68,8 @@ extern struct gcov_info *get_gcov_list (void) ATTR
these symbols will always need to be resolved.  */
 void (*__gcov_dummy_ref1)(void) = __gcov_reset;
 void (*__gcov_dummy_ref2)(void) = __gcov_dump;
+extern char *__gcov_get_profile_prefix (void);
+char *(*__gcov_dummy_ref3)(void) = __gcov_get_profile_prefix;

 /* Default callback function for profile instrumentation callback.  */
 extern void __coverage_callback (gcov_type, int);


Re: [google gcc-4_8] port gcov-tool to gcc-4_8

2014-01-17 Thread Rong Xu
Do we have to split params.def?

I can include params.h and link in params.o (a special version as we
don't have some global vars).

As for lipo_cutoff, I think we don't need a special handling -- we
should use the default value of 100 and let logic in dyn-ipa.c takes
care of the rest.

Please find the update patch which reads the default values from params.def.

-Rong

On Fri, Jan 17, 2014 at 12:05 PM, Xinliang David Li davi...@google.com wrote:
 For LIPO parameters, you can do this
 1) isolate all LIPO specific parameters into lipo_params.def, and
 include it from params.def.
 2) include lipo_params.def (after proper definition of DEFPARAM) in
 the profile tool source dir.

 By so doing, the compiler setting and profile tool will be in sync.
 (Unfortuately, for historic reason, the lipo_cutoff default val is
 still set in dyn-ipa.c even with the parameter, so some comments needs
 to be added in dyn-ipa.c to make sure the value should be doubly
 updated).

 David

 On Fri, Jan 17, 2014 at 11:15 AM, Rong Xu x...@google.com wrote:
 Hi,

 This patch port the gcov-tool work to google/gcc-4_8 branches.

 Tested with spec2006, profiledbootstrap and google internal benchmarks.

 -Rong
Index: Makefile.in
===
--- Makefile.in (revision 206671)
+++ Makefile.in (working copy)
@@ -123,7 +123,8 @@ SUBDIRS =@subdirs@ build
 
 # Selection of languages to be made.
 CONFIG_LANGUAGES = @all_selected_languages@
-LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
+LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \
+$(CONFIG_LANGUAGES)
 
 # Default values for variables overridden in Makefile fragments.
 # CFLAGS is for the user to override to, e.g., do a cross build with -O2.
@@ -194,6 +195,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D)-warn) $(
 build/gengtype-lex.o-warn = -Wno-error
 gengtype-lex.o-warn = -Wno-error
 expmed.o-warn = -Wno-error
+libgcov-util.o-warn = -Wno-error
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
@@ -755,6 +757,7 @@ GCC_TARGET_INSTALL_NAME := $(target_noncanonical)-
 CPP_INSTALL_NAME := $(shell echo cpp|sed '$(program_transform_name)')
 GCOV_INSTALL_NAME := $(shell echo gcov|sed '$(program_transform_name)')
 PROFILE_TOOL_INSTALL_NAME := $(shell echo profile_tool|sed 
'$(program_transform_name)')
+GCOV_TOOL_INSTALL_NAME := $(shell echo gcov-tool|sed 
'$(program_transform_name)')
 
 # Setup the testing framework, if you have one
 EXPECT = `if [ -f $${rootme}/../expect/expect ] ; then \
@@ -1480,7 +1483,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(CONFIG_LANG
 
 ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \
   $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \
-  $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS)
+  $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
+  $(GCOV_TOOL_OBJS)
 
 # This lists all host object files, whether they are included in this
 # compilation or not.
@@ -1505,6 +1509,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h insn
  $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \
  gcc-ranlib$(exeext) \
  gcov-iov$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \
+ gcov-tool$(exeect) \
  gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c libbackend.a \
  libcommon-target.a libcommon.a libgcc.mk
 
@@ -4070,6 +4075,24 @@ GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o
 gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \
$(LIBS) -o $@
+
+libgcov-util.o: $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \
+  $(srcdir)/../libgcc/libgcov-driver.c 
$(srcdir)/../libgcc/libgcov-driver-system.c \
+  $(srcdir)/../libgcc/libgcov-merge.c $(srcdir)/../libgcc/libgcov.h \
+  $(SYSTEM_H) coretypes.h $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) -o $@ 
$
+dyn-ipa.o: $(srcdir)/../libgcc/dyn-ipa.c gcov-io.c 
$(srcdir)/../libgcc/libgcov.h \
+   $(GCOV_IO_H) $(SYSTEM_H) coretypes.h \
+   $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
+ $(INCLUDES) -o $@ $
+
+GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o params.o
+gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \
+   $(CONFIG_H) version.h $(DIAGNOSTIC_H)
+gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS)
+   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) \
+ $(LIBS) -o $@
 #
 # Build the include directories.  The stamp files are stmp-* rather than
 # s-* so that mostlyclean does not force the include directory to
@@ -4697,6 +4720,13 @@ install-common: native lang.install-common install
$(INSTALL_PROGRAM) $(srcdir)/../contrib/profile_tool

Re: [PATCH] offline gcda profile processing tool

2014-01-16 Thread Rong Xu
Ping.

-Rong

On Mon, Jan 13, 2014 at 12:43 PM, Rong Xu x...@google.com wrote:
 Hi,

 This patch implements gcov-tool, a offline profile processing tool.
 This version supports merging two profiles with weights, and scaling
 the profile with a floating-point / fraction weight.

 Earlier discussion can be found
 http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02631.html

 In the discussion of the earlier patch, Honza wanted a separated
 patch for interface change for gcov_merge_*() functions. In this patch,
 the changes for gcov_merge_*() are minimal by using a wrapper function.
 So I include everything in this patch.

 Tested with bootstrap profiledboostrap and SPEC2006 profiles.

 Thanks,

 -Rong


Re: [google gcc-4_8] backport libgcov re-factoring patches from trunk

2014-01-15 Thread Rong Xu
On Wed, Jan 15, 2014 at 10:40 AM, Xinliang David Li davi...@google.com wrote:
 In libgcov-driver.c,

 1) there are couple of places with trailing white spaces (e.g, in
 gcov_sort_n_vals body), please remove
They are from the existing code. But I'll fix them.

 2) gcov_exit_write_gcda in trunk takes eof_pos as an arg, and check it
 before writing the header. I think this is more correct than in your
 patch
That's true. But this logic is from the newer code in trunk.
Current 4_8 code does not check this. I deliberately did this because I thought
the backporting patch should not change this.

I'll change to the trunk verison then.


 3) It would be better to keep the function order the same in trunk
 (e.g, compute summary related, merge gcda and write gcda etc); it is
 also helpful to keep LIPO related functions order in the same way as
 in google/main so that a better diff can be done
 4) libdriver-profiler.c -- make the function ordering the same as in
 google/main would be helpful.

sure. I'll do this two items. Will send an updated patch soon.


 thanks,

 David

 On Wed, Jan 15, 2014 at 10:03 AM, Rong Xu x...@google.com wrote:
 The attached patch backports libgcov re-factoring patches from trunk.

 Tested with google internal benchmarks, SPEC2006, bootstrap and
 profiledbootstrap.

 OK for google/gcc-4_8 branch?

 -Rong


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-09 Thread Rong Xu
My bad.
Thanks for the fix!

-Rong

On Thu, Jan 9, 2014 at 11:47 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jan 8, 2014 at 2:33 PM, Rong Xu x...@google.com wrote:
 On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka hubi...@ucw.cz wrote:
 @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
  #endif
  /* crc32 for this program.  */
  static gcov_unsigned_t crc32;
 +/* Use this summary checksum rather the computed one if the value is
 + *non-zero.  */
 +static gcov_unsigned_t saved_summary_checksum;

 Why do you need to save the checksum? Won't it reset summary back with 
 multiple streaming?

 This was for the gcov_tool. checksum will be recomputed in gcov_exit
 and the value will depend on
 the order of gcov_info list. (the order will be different after
 reading from gcda files to memory). The purpose was
 to have the same summary_checksum so that I can get identical gcov-dump 
 output.


 I would really like to avoid introducing those static vars that are used 
 exclusively
 by gcov_exit.  What about putting them into an gcov_context structure that
 is passed around the functions that was broken out?

 With my recently patch the localizes this_prg, we only use 64 more
 bytes in bss. Do you still we have to remove
 all these statics?



 libgcc ChangeLog entries should be in libgcc/ChangeLog,
 not gcc/ChangeLog.  I checked in a patch to move them
 to libgcc/ChangeLog.

 --
 H.J.


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu
 _gcov_pow2_profiler.o (ex 
 ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
 115   0   0 115  73 _gcov_one_value_profiler.o (ex 
 ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
 122   0   0 122  7a _gcov_indirect_call_profiler.o (ex 
 ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
  57   0   0  57  39 _gcov_average_profiler.o (ex 
 ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
  52   0   0  52  34 _gcov_ior_profiler.o (ex 
 ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
 125   0   0 125  7d _gcov_merge_ior.o (ex 
 ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)

2014-01-08  Rong Xu  x...@google.com

* libgcc/libgcov-driver.c (this_prg): make it local to save
bss space.
(gcov_exit_compute_summary): Ditto.
(gcov_exit_merge_gcda): Ditto.
(gcov_exit_merge_summary): Ditto.
(gcov_exit_dump_gcov): Ditto.

Index: libgcc/libgcov-driver.c
===
--- libgcc/libgcov-driver.c (revision 206437)
+++ libgcc/libgcov-driver.c (working copy)
@@ -303,8 +303,6 @@ gcov_compute_histogram (struct gcov_summary *sum)
 }
 }
 
-/* summary for program.  */
-static struct gcov_summary this_prg;
 /* gcda filename.  */
 static char *gi_filename;
 /* buffer for the fn_data from another program.  */
@@ -317,10 +315,10 @@ static struct gcov_summary_buffer *sum_buffer;
 static int run_accounted = 0;
 
 /* This funtions computes the program level summary and the histo-gram.
-   It computes and returns CRC32  and stored summari in THIS_PRG.  */
+   It computes and returns CRC32 and stored summary in THIS_PRG.  */
 
 static gcov_unsigned_t
-gcov_exit_compute_summary (void)
+gcov_exit_compute_summary (struct gcov_summary *this_prg)
 {
   struct gcov_info *gi_ptr;
   const struct gcov_fn_info *gfi_ptr;
@@ -332,7 +330,7 @@ static gcov_unsigned_t
   gcov_unsigned_t crc32 = 0;
 
   /* Find the totals for this execution.  */
-  memset (this_prg, 0, sizeof (this_prg));
+  memset (this_prg, 0, sizeof (*this_prg));
   for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr-next)
 {
   crc32 = crc32_unsigned (crc32, gi_ptr-stamp);
@@ -357,7 +355,7 @@ static gcov_unsigned_t
   if (!gi_ptr-merge[t_ix])
 continue;
 
-  cs_ptr = this_prg.ctrs[t_ix];
+  cs_ptr = (this_prg-ctrs[t_ix]);
   cs_ptr-num += ci_ptr-num;
   crc32 = crc32_unsigned (crc32, ci_ptr-num);
 
@@ -371,7 +369,7 @@ static gcov_unsigned_t
 }
 }
 }
-  gcov_compute_histogram (this_prg);
+  gcov_compute_histogram (this_prg);
   return crc32;
 }
 
@@ -393,6 +391,7 @@ struct gcov_filename_aux{
 static int
 gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
   struct gcov_summary *prg_p,
+  struct gcov_summary *this_prg,
   gcov_position_t *summary_pos_p,
   gcov_position_t *eof_pos_p,
  gcov_unsigned_t crc32)
@@ -446,7 +445,7 @@ gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
 goto next_summary;
 
   for (t_ix = 0; t_ix != GCOV_COUNTERS_SUMMABLE; t_ix++)
-if (tmp.ctrs[t_ix].num != this_prg.ctrs[t_ix].num)
+if (tmp.ctrs[t_ix].num != this_prg-ctrs[t_ix].num)
   goto next_summary;
   *prg_p = tmp;
   *summary_pos_p = *eof_pos_p;
@@ -636,7 +635,7 @@ gcov_exit_write_gcda (const struct gcov_info *gi_p
 
 static int
 gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary 
*prg,
-gcov_unsigned_t crc32,
+ struct gcov_summary *this_prg, gcov_unsigned_t crc32,
 struct gcov_summary *all_prg __attribute__ ((unused)))
 {
   struct gcov_ctr_summary *cs_prg, *cs_tprg;
@@ -650,7 +649,7 @@ gcov_exit_merge_summary (const struct gcov_info *g
   for (t_ix = 0; t_ix  GCOV_COUNTERS_SUMMABLE; t_ix++)
 {
   cs_prg = (prg-ctrs[t_ix]);
-  cs_tprg = this_prg.ctrs[t_ix];
+  cs_tprg = (this_prg-ctrs[t_ix]);
 
   if (gi_ptr-merge[t_ix])
 {
@@ -719,7 +718,8 @@ gcov_exit_merge_summary (const struct gcov_info *g
 
 static void
 gcov_exit_dump_gcov (struct gcov_info *gi_ptr, struct gcov_filename_aux *gf,
-gcov_unsigned_t crc32, struct gcov_summary *all_prg)
+gcov_unsigned_t crc32, struct gcov_summary *all_prg,
+ struct gcov_summary *this_prg)
 {
   struct gcov_summary prg; /* summary for this object over all program.  */
   int error;
@@ -743,7 +743,7 @@ gcov_exit_dump_gcov (struct gcov_info *gi_ptr, str
   gcov_error (profiling:%s:Not a gcov data file\n, gi_filename);
   goto read_fatal;
 }
-  error = gcov_exit_merge_gcda (gi_ptr, prg, summary_pos, eof_pos,
+  error = gcov_exit_merge_gcda (gi_ptr, prg, this_prg, summary_pos, 
eof_pos,
crc32

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu
On Wed, Dec 18, 2013 at 9:28 AM, Xinliang David Li davi...@google.com wrote:

  #ifdef L_gcov_merge_ior
  /* The profile merging function that just adds the counters.  It is given
 -   an array COUNTERS of N_COUNTERS old counters and it reads the same 
 number
 -   of counters from the gcov file.  */
 +   an array COUNTERS of N_COUNTERS old counters.
 +   When SRC==NULL, it reads the same number of counters from the gcov file.
 +   Otherwise, it reads from SRC array.  */
  void
 -__gcov_merge_ior (gcov_type *counters, unsigned n_counters)
 +__gcov_merge_ior (gcov_type *counters, unsigned n_counters,
 +  gcov_type *src, unsigned w __attribute__ ((unused)))

 So the new in-memory variants are introduced for merging tool, while libgcc 
 use gcov_read_counter
 interface?
 Perhaps we can actually just duplicate the functions to avoid runtime to do 
 all the scalling
 and in_mem tests it won't need?


 I thought about this one a little. How about making the interface
 change conditionally, but still share the implementation?  The merge
 function bodies mostly remain unchanged and there is no runtime
 penalty for libgcov.  The new macros can be shared across most of the
 mergers.

 #ifdef IN_PREOFILE_TOOL
 #define GCOV_MERGE_EXTRA_ARGS  gcov_type *src, unsigned w
 #define GCOV_READ_COUNTER  *(src++) * w
 #else
 #define GCOV_MERGE_EXTRA_ARGS
 #define GCOV_READ_COUNTER gcov_read_counter ()
 #endif

 __gcov_merge_add (gcov_type *counters, unsigned n_counters,
   GCOV_MERGE_EXTRA_ARGS)
 {

  for (; n_counters; counters++, n_counters--)
   {
   *counters += GCOV_READ_COUNTER ;
}

 }

 thanks,

Personally I don't think the run time test of in_mem will cause any
issue. This is in profile dumping, why don't we care a few more cycle
heres? it won't pollute the profile.

If you really don't like that, we can use the above approach, or I can
hide the logic in gcov_read_counter(), i.e. overload
gcov_read_counter() in profile_tool. For that, I will need a new
global variable SRC and set it before calling the merge function.
I would prefer to keep weight in _gcov_merge_* argument list.

What do you think?

-Rong


 David


 I would suggest going with libgcov.h changes and clenaups first, with 
 interface changes next
 and the gcov-tool is probably quite obvious at the end?
 Do you think you can split the patch this way?

 Thanks and sorry for taking long to review. I should have more time again 
 now.
 Honza


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu
On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka hubi...@ucw.cz wrote:
 @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
  #endif
  /* crc32 for this program.  */
  static gcov_unsigned_t crc32;
 +/* Use this summary checksum rather the computed one if the value is
 + *non-zero.  */
 +static gcov_unsigned_t saved_summary_checksum;

 Why do you need to save the checksum? Won't it reset summary back with 
 multiple streaming?

This was for the gcov_tool. checksum will be recomputed in gcov_exit
and the value will depend on
the order of gcov_info list. (the order will be different after
reading from gcda files to memory). The purpose was
to have the same summary_checksum so that I can get identical gcov-dump output.


 I would really like to avoid introducing those static vars that are used 
 exclusively
 by gcov_exit.  What about putting them into an gcov_context structure that
 is passed around the functions that was broken out?

With my recently patch the localizes this_prg, we only use 64 more
bytes in bss. Do you still we have to remove
all these statics?




Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-12-20 Thread Rong Xu
%-75.50%
benchmark_9   81.6+9.91%  +14.10%-69.81%
benchmark_10  69.8   +23.60%   -1.53% --
benchmark_11  89.9+0.34%   +4.32%-59.62%
benchmark_12  98.7+0.96%   +0.88%-10.52%
benchmark_13  80.5-8.31%   -0.55%-75.90%
benchmark_14  17.1  +429.47%   -5.21% -4.87%
benchmark_15  22.0  +295.70%   -1.29%-46.36%
benchmark_16  80.0-6.31%   +0.54%-62.61%
benchmark_17  83.5+4.84%  +10.71%-70.48%
benchmark_18  90.0-1.27%   +3.18%-72.30%
geometric mean   +24.51%   +4.81%  -62.44% (incomplete)

benchmark_13  77.1 -3.34%+5.75% --

On Mon, Nov 25, 2013 at 11:19 AM, Rong Xu x...@google.com wrote:
 On Mon, Nov 25, 2013 at 2:11 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Nov 22, 2013 at 10:49 PM, Rong Xu x...@google.com wrote:
 On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu x...@google.com wrote:
 Hi,

 This patch injects a condition into the instrumented code for edge
 counter update. The counter value will not be updated after reaching
 value 1.

 The feature is under a new parameter --param=coverage-exec_once.
 Default is disabled and setting to 1 to enable.

 This extra check usually slows the program down. For SPEC 2006
 benchmarks (all single thread programs), we usually see around 20%-35%
 slow down in -O2 coverage build. This feature, however, is expected to
 improve the coverage run speed for multi-threaded programs, because
 there virtually no data race and false sharing in updating counters.
 The improvement can be significant for highly threaded programs -- we
 are seeing 7x speedup in coverage test run for some non-trivial google
 applications.

 Tested with bootstrap.

 Err - why not simply emit

   counter = 1

 for the counter update itself with that --param (I don't like a --param
 for this either).

 I assume that CPUs can avoid data-races and false sharing for
 non-changing accesses?


 I'm not aware of any CPU having this feature. I think a write to the
 shared cache line to invalidate all the shared copies. I cannot find
 any reference on checking the value of the write. Do you have any
 pointer to the feature?

 I don't have any pointer - but I remember seeing this in the context
 of atomics thus it may be only in the context of using a xchg
 or cmpxchg instruction.  Which would make it non-portable to
 some extent (if you don't want to use atomic builtins here).


 cmpxchg should work here -- it's a conditional write so the data race
 /false sharing can be avoided.
 I'm comparing the performance b/w explicit branch vs cmpxchg and will
 report back.

 -Rong


 Richard.

 I just tested this implementation vs. simply setting to 1, using
 google search as the benchmark.
 This one is 4.5x faster. The test was done on Intel Westmere systems.

 I can change the parameter to an option.

 -Rong

 Richard.

 Thanks,

 -Rong


Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-11-25 Thread Rong Xu
On Mon, Nov 25, 2013 at 2:11 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Nov 22, 2013 at 10:49 PM, Rong Xu x...@google.com wrote:
 On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu x...@google.com wrote:
 Hi,

 This patch injects a condition into the instrumented code for edge
 counter update. The counter value will not be updated after reaching
 value 1.

 The feature is under a new parameter --param=coverage-exec_once.
 Default is disabled and setting to 1 to enable.

 This extra check usually slows the program down. For SPEC 2006
 benchmarks (all single thread programs), we usually see around 20%-35%
 slow down in -O2 coverage build. This feature, however, is expected to
 improve the coverage run speed for multi-threaded programs, because
 there virtually no data race and false sharing in updating counters.
 The improvement can be significant for highly threaded programs -- we
 are seeing 7x speedup in coverage test run for some non-trivial google
 applications.

 Tested with bootstrap.

 Err - why not simply emit

   counter = 1

 for the counter update itself with that --param (I don't like a --param
 for this either).

 I assume that CPUs can avoid data-races and false sharing for
 non-changing accesses?


 I'm not aware of any CPU having this feature. I think a write to the
 shared cache line to invalidate all the shared copies. I cannot find
 any reference on checking the value of the write. Do you have any
 pointer to the feature?

 I don't have any pointer - but I remember seeing this in the context
 of atomics thus it may be only in the context of using a xchg
 or cmpxchg instruction.  Which would make it non-portable to
 some extent (if you don't want to use atomic builtins here).


cmpxchg should work here -- it's a conditional write so the data race
/false sharing can be avoided.
I'm comparing the performance b/w explicit branch vs cmpxchg and will
report back.

-Rong


 Richard.

 I just tested this implementation vs. simply setting to 1, using
 google search as the benchmark.
 This one is 4.5x faster. The test was done on Intel Westmere systems.

 I can change the parameter to an option.

 -Rong

 Richard.

 Thanks,

 -Rong


Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-11-22 Thread Rong Xu
On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu x...@google.com wrote:
 Hi,

 This patch injects a condition into the instrumented code for edge
 counter update. The counter value will not be updated after reaching
 value 1.

 The feature is under a new parameter --param=coverage-exec_once.
 Default is disabled and setting to 1 to enable.

 This extra check usually slows the program down. For SPEC 2006
 benchmarks (all single thread programs), we usually see around 20%-35%
 slow down in -O2 coverage build. This feature, however, is expected to
 improve the coverage run speed for multi-threaded programs, because
 there virtually no data race and false sharing in updating counters.
 The improvement can be significant for highly threaded programs -- we
 are seeing 7x speedup in coverage test run for some non-trivial google
 applications.

 Tested with bootstrap.

 Err - why not simply emit

   counter = 1

 for the counter update itself with that --param (I don't like a --param
 for this either).

 I assume that CPUs can avoid data-races and false sharing for
 non-changing accesses?


I'm not aware of any CPU having this feature. I think a write to the
shared cache line to invalidate all the shared copies. I cannot find
any reference on checking the value of the write. Do you have any
pointer to the feature?

I just tested this implementation vs. simply setting to 1, using
google search as the benchmark.
This one is 4.5x faster. The test was done on Intel Westmere systems.

I can change the parameter to an option.

-Rong

 Richard.

 Thanks,

 -Rong


[PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-11-21 Thread Rong Xu
Hi,

This patch injects a condition into the instrumented code for edge
counter update. The counter value will not be updated after reaching
value 1.

The feature is under a new parameter --param=coverage-exec_once.
Default is disabled and setting to 1 to enable.

This extra check usually slows the program down. For SPEC 2006
benchmarks (all single thread programs), we usually see around 20%-35%
slow down in -O2 coverage build. This feature, however, is expected to
improve the coverage run speed for multi-threaded programs, because
there virtually no data race and false sharing in updating counters.
The improvement can be significant for highly threaded programs -- we
are seeing 7x speedup in coverage test run for some non-trivial google
applications.

Tested with bootstrap.

Thanks,

-Rong
2013-11-21  Rong Xu  x...@google.com

* gcc/doc/invoke.texi (coverage-exec_once): Add document.
* gcc/params.def (coverage-exec_once): New param.
* gcc/profile.h (gcov_type sum_edge_counts): Add extern decl.
* gcc/profile.c (branch_prob): Add support for coverage-exec_once.
* gcc/tree-profile.c (gimple_gen_edge_profiler): Ditto.
(gimple_gen_ior_profiler): Ditto.
(insert_if_then): Ditto.
(add_execonce_wrapper): Ditto.
(is_pred_instrumentation_candidate): Ditto.
(add_predicate_to_edge_counters): Ditto.
(cleanup_pred_instrumentation): Ditto.
(init_pred_instrumentation): Ditto.
(tree_profiling): Ditto.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 205226)
+++ gcc/doc/invoke.texi (working copy)
@@ -9937,6 +9937,10 @@ The default choice depends on the target.
 Set the maximum number of existing candidates that will be considered when
 seeking a basis for a new straight-line strength reduction candidate.
 
+@item coverage-exec_once
+Set to 1 to update each arc counter only once. This avoids false sharing
+and speeds up profile-generate run for multi-threaded programs.
+
 @end table
 @end table
 
Index: gcc/params.def
===
--- gcc/params.def  (revision 205226)
+++ gcc/params.def  (working copy)
@@ -441,6 +441,14 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY,
 Stop forward growth if the probability of best edge is less than this 
threshold (in percent). Used when profile feedback is not available,
 50, 0, 100)
 
+/* This parameter enables fast coverage test. Once the counter flips from 0
+   to 1, we no longer updating the value . This avoids false sharing and speeds
+   up profile-generate run for multi-threaded programs.  */
+DEFPARAM (PARAM_COVERAGE_EXEC_ONCE,
+ coverage-exec_once,
+ Stop incrementing edge counts once they become 1.,
+ 0, 0, 1)
+
 /* The maximum number of incoming edges to consider for crossjumping.  */
 DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
 max-crossjump-edges,
Index: gcc/profile.c
===
--- gcc/profile.c   (revision 205226)
+++ gcc/profile.c   (working copy)
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include cfgloop.h
 #include dumpfile.h
 #include cgraph.h
+#include params.h
 
 #include profile.h
 
@@ -1327,6 +1328,9 @@ branch_prob (void)
 
   /* Commit changes done by instrumentation.  */
   gsi_commit_edge_inserts ();
+
+  if (PARAM_VALUE (PARAM_COVERAGE_EXEC_ONCE))
+add_predicate_to_edge_counters ();
 }
 
   free_aux_for_edges ();
Index: gcc/profile.h
===
--- gcc/profile.h   (revision 205226)
+++ gcc/profile.h   (working copy)
@@ -46,6 +46,9 @@ extern gcov_type sum_edge_counts (vecedge, va_gc
 extern void init_node_map (bool);
 extern void del_node_map (void);
 
+/* Insert a predicate to edge counter update stmt.  */
+extern void add_predicate_to_edge_counters (void);
+
 extern void get_working_sets (void);
 
 /* In predict.c.  */
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c  (revision 205226)
+++ gcc/tree-profile.c  (working copy)
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include target.h
 #include tree-cfgcleanup.h
 #include tree-nested.h
+#include params.h
 
 static GTY(()) tree gcov_type_node;
 static GTY(()) tree tree_interval_profiler_fn;
@@ -67,6 +68,11 @@ static GTY(()) tree ic_void_ptr_var;
 static GTY(()) tree ic_gcov_type_ptr_var;
 static GTY(()) tree ptr_void;
 
+/* A pointer-set of the last statement in each block of statements that need to
+   be applied a predicate .  */
+static struct pointer_set_t *pred_instrumentation_candidate = NULL;
+
+
 /* Do initialization work for the edge profiler.  */
 
 /* Add code:
@@ -295,6 +301,10 @@ gimple_gen_edge_profiler (int edgeno, edge e)
   stmt2

Re: atomic update of profile counters (issue7000044)

2013-11-20 Thread Rong Xu
On Tue, Nov 19, 2013 at 5:11 PM, Andrew Pinski pins...@gmail.com wrote:
 On Tue, Nov 19, 2013 at 5:02 PM, Rong Xu x...@google.com wrote:
 Hi all,

 I merged this old patch with current trunk. I also make the following changes
 (1) not using weak references. Now every *profile_atomic() has it's
 own .o so that none of them will be in the final binary if
 -fprofile-generate-atomic is not specified.
 (2) more value profilers have the atomic version.
 (3) not link to libatomic. I used to link the libatomic in the
 presence of -fprofile-generate-atomic, per Andrew's suggestion. It
 used to work. But now if I can add -latomic in the SPEC, it cannot
 find the libatomic.so.1 (unless I specify the PATH). I did not find an
 easy way to statically link libatomic.a. Andrew: Do you have any
 suggestion? Or should we let the user link to libatomic.a if the
 builtins are not expanded?

 It should work for an installed GCC.  For testing you might need
 something that is included inside testsuite/lib/atomic-dg.exp which
 sets the library path to include libatomic build directory.

When I change the SPEC to include libatomic,
the compiler can find libatomic. I.e. using
 gcc -O2 -fprofile-generate -fprofile-generate-atomic=3 hello_world.c
generates a.out without any problem.

But since there are both shared and static libatomic in lib64, it
chooses to use the dynamic one.
 ldd a.out
linux-vdso.so.1 =  (0x7fff56bff000)
libatomic.so.1 = not found
libc.so.6 = /lib/x86_64-linux-gnu/libc.so.6 (0x2b0720261000)
/lib64/ld-linux-x86-64.so.2 (0x2b072003c000)

 ./a.out
./a.out: error while loading shared libraries: libatomic.so.1: cannot
open shared object file: No such file or directory

while
 LD_LIBRARY_PATH=gcc_install_patch/lib64 ./a.out
works fine.

I think that's the same reason we set the library path in
testsuite/lib/atomic-dg.exp, because gcc_install_patch/lib64
is not in the dynamic library search list.

I could do this in the SPEC
  -Wl,-Bstatic -latomic -Wl,-Bdynamic
which would link libatomic statically.
I works for me. But it looks a little weird in gcc driver.

Index: gcc.c
===
--- gcc.c   (revision 205053)
+++ gcc.c   (working copy)
@@ -771,7 +771,8 @@
 %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
 %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
 %(mflib)  STACK_SPLIT_SPEC \
-%{fprofile-arcs|fprofile-generate*|coverage:-lgcov}  SANITIZER_SPEC  \
+%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\
+  %{fprofile-generate-atomic=*:-Wl,-Bstatic -latomic
-Wl,-Bdynamic}}  SANITIZER_SPEC  \
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
 %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}
 #endif



 I think now we require libatomic in more cases (C11 atomic support for
 an example).

 Thanks,
 Andrew Pinski


 Is this OK for trunk?

 Thanks,

 -Rong

 On Mon, Jan 7, 2013 at 12:55 PM, Rong Xu x...@google.com wrote:
 Function __gcov_indirect_call_profiler_atomic (which contains call to
 the atomic function) is always emitted in libgcov.
 Since we only link libatomic when -fprofile-gen-atomic is specified,
 we have to make the atomic function weak -- otherwise, there is a
 unsat for regular FDO gen build (of course, when the builtin is not
 expanded).

 An alternative it to always link libatomic together with libgcov. Then
 we don't need the weak stuff. I'm not sure when one is better.

 -Rong

 On Mon, Jan 7, 2013 at 12:36 PM, Richard Henderson r...@redhat.com wrote:
 On 01/03/2013 04:42 PM, Rong Xu wrote:
 It links libatomic when -fprofile-gen-atomic is specified for FDO
 instrumentation build. Here I assume libatomic is always installed.
 Andrew: do you think if this is reasonable?

 It also disables the functionality if target does not support weak
 (ie. TARGET_SUPPORTS_WEAK == 0).

 Since you're linking libatomic, you don't need weak references.

 I think its ok to assume libatomic is installed, given that the
 user has had to explicitly use the command-line option.


 r~


Re: atomic update of profile counters (issue7000044)

2013-11-20 Thread Rong Xu
Joseph and Andrew, thanks for the suggestion. That's really helpful.

Here is the new patch for gcc.c.
Basically, it's just what you have suggested: enclosing -latomic with
--as-needed, and using macros.
For the case of no --as-needed support, I use static link. (just found
that some code already using this in the SPEC).
I'm flexible on this part -- if you think this is unnecessary, I can remove.

Thanks,

-Rong

Index: gcc.c
===
--- gcc.c   (revision 205053)
+++ gcc.c   (working copy)
@@ -748,6 +748,23 @@ proper position among the other output files.  */
 %{fvtable-verify=preinit: -lvtv -u_vtable_map_vars_start
-u_vtable_map_vars_end}}
 #endif

+/* This spec is for linking in libatomic in gcov atomic counter update.
+   We will use the atomic functions defined in libatomic, only when the builtin
+   versions are not available. In the case of no LD_AS_NEEDED support, we
+   link libatomic statically.  */
+
+#ifndef GCOV_ATOMIC_SPEC
+#if USE_LD_AS_NEEDED
+#define GCOV_ATOMIC_SPEC %{fprofile-generate-atomic=*: LD_AS_NEEDED_OPTION \
+   -latomic}  LD_NO_AS_NEEDED_OPTION
+#elif defined(HAVE_LD_STATIC_DYNAMIC)
+#define GCOV_ATOMIC_SPEC %{fprofile-generate-atomic=*: LD_STATIC_OPTION \
+ -latomic  LD_DYNAMIC_OPTION }
+#else /* !USE_LD_AS_NEEDED  !HAVE_LD_STATIC_DYNAMIC  */
+#define GCOV_ATOMIC_SPEC %{fprofile-generate-atomic=*:-latomic}
+#endif
+#endif
+
 /* -u* was put back because both BSD and SysV seem to support it.  */
 /* %{static:} simply prevents an error message if the target machine
doesn't handle -static.  */
@@ -771,7 +788,8 @@ proper position among the other output files.  */
 %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
 %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
 %(mflib)  STACK_SPLIT_SPEC \
-%{fprofile-arcs|fprofile-generate*|coverage:-lgcov}  SANITIZER_SPEC  \
+%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\
+   GCOV_ATOMIC_SPEC }  SANITIZER_SPEC  \
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
 %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}



On Wed, Nov 20, 2013 at 1:04 PM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Wed, 20 Nov 2013, Rong Xu wrote:

 I could do this in the SPEC
   -Wl,-Bstatic -latomic -Wl,-Bdynamic
 which would link libatomic statically.
 I works for me. But it looks a little weird in gcc driver.

 I think we should generally link libatomic with --as-needed by default on
 platforms supporting --as-needed, in line with the general principle that
 C code just using language not library facilities (_Atomic in this case)
 shouldn't need any special options to link it (libatomic is like libgcc,
 which is linked in automatically); the trickier question is what to do
 with it on any systems supporting shared libraries but not --as-needed.

 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: atomic update of profile counters (issue7000044)

2013-11-20 Thread Rong Xu
OK. Sorry for miss-reading the message.
In that case, linking in libatomic becomes a separate issue. We don't
need to touch gcc.c in this patch.

Thanks,

-Rong

On Wed, Nov 20, 2013 at 2:19 PM, Andrew Pinski pins...@gmail.com wrote:
 On Wed, Nov 20, 2013 at 2:07 PM, Rong Xu x...@google.com wrote:
 Joseph and Andrew, thanks for the suggestion. That's really helpful.

 Here is the new patch for gcc.c.
 Basically, it's just what you have suggested: enclosing -latomic with
 --as-needed, and using macros.
 For the case of no --as-needed support, I use static link. (just found
 that some code already using this in the SPEC).
 I'm flexible on this part -- if you think this is unnecessary, I can remove.


 I think Joseph's suggestion was also to include -latomic even when not
 generating atomic profiling due to the C11 code requiring it.

 Thanks,
 Andrew


 Thanks,

 -Rong

 Index: gcc.c
 ===
 --- gcc.c   (revision 205053)
 +++ gcc.c   (working copy)
 @@ -748,6 +748,23 @@ proper position among the other output files.  */
  %{fvtable-verify=preinit: -lvtv -u_vtable_map_vars_start
 -u_vtable_map_vars_end}}
  #endif

 +/* This spec is for linking in libatomic in gcov atomic counter update.
 +   We will use the atomic functions defined in libatomic, only when the 
 builtin
 +   versions are not available. In the case of no LD_AS_NEEDED support, we
 +   link libatomic statically.  */
 +
 +#ifndef GCOV_ATOMIC_SPEC
 +#if USE_LD_AS_NEEDED
 +#define GCOV_ATOMIC_SPEC %{fprofile-generate-atomic=*: 
 LD_AS_NEEDED_OPTION \
 +   -latomic}  LD_NO_AS_NEEDED_OPTION
 +#elif defined(HAVE_LD_STATIC_DYNAMIC)
 +#define GCOV_ATOMIC_SPEC %{fprofile-generate-atomic=*: LD_STATIC_OPTION \
 + -latomic  LD_DYNAMIC_OPTION }
 +#else /* !USE_LD_AS_NEEDED  !HAVE_LD_STATIC_DYNAMIC  */
 +#define GCOV_ATOMIC_SPEC %{fprofile-generate-atomic=*:-latomic}
 +#endif
 +#endif
 +
  /* -u* was put back because both BSD and SysV seem to support it.  */
  /* %{static:} simply prevents an error message if the target machine
 doesn't handle -static.  */
 @@ -771,7 +788,8 @@ proper position among the other output files.  */
  
 %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
  %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
  %(mflib)  STACK_SPLIT_SPEC \
 -%{fprofile-arcs|fprofile-generate*|coverage:-lgcov}  SANITIZER_SPEC  \
 +%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\
 +   GCOV_ATOMIC_SPEC }  SANITIZER_SPEC  \
  %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
  %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}



 On Wed, Nov 20, 2013 at 1:04 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Wed, 20 Nov 2013, Rong Xu wrote:

 I could do this in the SPEC
   -Wl,-Bstatic -latomic -Wl,-Bdynamic
 which would link libatomic statically.
 I works for me. But it looks a little weird in gcc driver.

 I think we should generally link libatomic with --as-needed by default on
 platforms supporting --as-needed, in line with the general principle that
 C code just using language not library facilities (_Atomic in this case)
 shouldn't need any special options to link it (libatomic is like libgcc,
 which is linked in automatically); the trickier question is what to do
 with it on any systems supporting shared libraries but not --as-needed.

 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-20 Thread Rong Xu
Ping.

On Mon, Nov 18, 2013 at 12:31 PM, Xinliang David Li davi...@google.com wrote:
 gcov-dump tool does raw dump of profile data. In the long run, we
 should have only one gcov profile manipulation tool, so it might be
 better to incorporate gcov-dump into gcov-tool and get rid of
 'gcov-dump'.

 David

 On Mon, Nov 18, 2013 at 12:24 PM, Rong Xu x...@google.com wrote:
 Hi, all

 This is the new patch for gcov-tool (previously profile-tool).

 Honza: can you comment on the new merge interface? David posted some
 comments in an earlier email and we want to know what's your opinion.

 Test patch has been tested with boostrap, regresssion,
 profiledbootstrap and SPEC2006.

 Noticeable changes from the earlier version:

 1. create a new file libgcov.h and move libgcov-*.h headers to libgcov.h
 So we can included multiple libgcov-*.c without adding new macros.

 2. split libgcov.h specific code in gcvo-io.h to libcc/libgcov.h
 Avoid multiple-page of code under IN_LIBGCOV macro -- this
 improves the readability.

 3. make gcov_var static, and move the definition from gcov-io.h to
 gcov-io.c. Also
move some static functions accessing gcov_var to gcvo-io.c
 Current code rely on GCOV_LINKAGE tricks to avoid multi-definition. I don't 
 see
 a reason that gcov_var needs to exposed as a global.

 4. expose gcov_write_strings() and gcov_sync() to gcov_tool usage

 5. rename profile-tool to gcov-tool per Honza's suggestion.

 Thanks,

 -Rong

 On Tue, Nov 12, 2013 at 4:27 PM, Rong Xu x...@google.com wrote:
 The patch was checked in as r204730.

 Tested with
 configure with --enable-languages=all,obj-c++
 (ada is currently broken, so I was not be able test)
 bootstrap and profiledbootstrap
 regression for -m32 and -m64.
 spec2006

 The only semantic change is __gcov_flush() used to be under L_gcov.
 I put the function to libgcov-interface.c and made it under
 L_gcov_flush (newly added).
 So if you need this function, you have to enable L_gcov_flush in the
 libgcc/Makefile.in.

 Thanks,

 -Rong

 On Mon, Nov 11, 2013 at 10:19 AM, Rong Xu x...@google.com wrote:
 On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka hubi...@ucw.cz wrote:
 2013-11-04  Rong Xu  x...@google.com

   * libgcc/libgcov.c: Delete as part of re-factoring.
   * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from
   libgcov.c
   (__gcov_pow2_profiler): Ditto.
   (__gcov_one_value_profiler_body): Ditto.
   (__gcov_one_value_profiler): Ditto.
   (__gcov_indirect_call_profiler): Ditto.
   (__gcov_indirect_call_profiler_v2): Ditto.
   (__gcov_average_profiler): Ditto.
   (__gcov_ior_profiler): Ditto.
   * libgcc/libgcov-driver.c (this_prg): Make it file scope
   static variable.
   (all_prg): Ditto.
   (crc32): Ditto.
   (gi_filename): Ditto.
   (fn_buffer): Ditto.
   (sum_buffer): Ditto.
 (struct gcov_filename_aux): New types to store auxiliary 
 information
 for gi_filename.
   (gcov_version): Moved from libgcov.c.
   (crc32_unsigned): Ditto.
   (gcov_histogram_insert): Ditto.
   (gcov_compute_histogram): Ditto.
   (gcov_exit): Ditto.
   (gcov_clear): Ditto.
   (__gcov_init): Ditto.
   (gcov_exit_compute_summary): New function split from gcov_exit().
   (gcov_exit_merge_gcda): Ditto.
   (gcov_exit_write_gcda): Ditto.
   (gcov_exit_dump_gcov): Ditto.
   * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c.
   (init_mx_once): Ditto.
   (__gcov_flush): Ditto.
   (__gcov_reset): Ditto.
   (__gcov_dump): Ditto.
   (__gcov_fork): Ditto.
   (__gcov_execl): Ditto.
   (__gcov_execlp): Ditto.
   (__gcov_execle): Ditto.
   (__gcov_execv): Ditto.
   (__gcov_execvp): Ditto.
   (__gcov_execve): Ditto.
   * libgcc/libgcov-driver-system.c (gcov_error): New utility 
 function.
   (allocate_filename_struct): New function split from gcov_exit().
   (gcov_exit_open_gcda_file): Ditto.
 (create_file_directory): Moved from libgcov.c.
   * libgcc/libgcov-merge.c:
   (__gcov_merge_add): Moved from libgcov.c.
   (__gcov_merge_ior): Ditto.
   (__gcov_merge_single): Ditto.
   (__gcov_merge_delta): Ditto.
   * libgcc/Makefile.in: Change to build newly added files.
   * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning.


 Hi,
 the patch looks resonable. I take your word on the fact that there are no 
 changes
 in the code, I did not read it all ;)))

 We did split gcov_exit() into more functions.
 But, semantically the code is the same.

 Index: libgcc/Makefile.in
 ===
 --- libgcc/Makefile.in(revision 204285)
 +++ libgcc/Makefile.in(working copy)
 @@ -853,17 +853,37 @@ include $(iterator)
  # Build libgcov components.

  # Defined in libgcov.c, included only in gcov library
 -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta

Re: Fix summary generation with fork

2013-11-19 Thread Rong Xu
On Mon, Nov 18, 2013 at 2:15 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 this patch fixes problem we noticed with Martin Liska where gcov_dump is 
 called
 several times per execution of firefox (on each fork and exec).  This causes
 runs to be large and makes functions executed once per program to be 
 considered
 cold.

 This patch makes us to update runs just once per execution and not on each
 streaming of profile data.  While testing it I also noticed that program
 summary is now broken, since crc32 is accumulated per each dumping instead 
 just
 once.

You are right. I forgot that gcov_exit() can be called multiple times.
Should have
initialized crc32 per gcov_exit() call.

Thanks for the fix.

 I believe the newly introduced static vars should go - there is nothing really
 preventing us from doing two concurent updates and also it just unnecesary
 increases to footprint of libgcov.  I converted thus all_prg and crc32 back to
 local vars.

 Bootstrapped/regtested x86_64-linux, comitted.

 * libgcov-driver.c (get_gcov_dump_complete): Update comments.
 (all_prg, crc32): Remove static vars.
 (gcov_exit_compute_summary): Rewrite to return crc32; do not clear
 all_prg.
 (gcov_exit_merge_gcda): Add crc32 parameter.
 (gcov_exit_merge_summary): Add crc32 and all_prg parameter;
 do not account run if it was already accounted.
 (gcov_exit_dump_gcov): Add crc32 and all_prg parameters.
 (gcov_exit): Initialize all_prg; update.
 Index: libgcov-driver.c
 ===
 --- libgcov-driver.c(revision 204945)
 +++ libgcov-driver.c(working copy)
 @@ -96,7 +96,7 @@ static size_t gcov_max_filename = 0;
  /* Flag when the profile has already been dumped via __gcov_dump().  */
  static int gcov_dump_complete;

 -/* A global functino that get the vaule of gcov_dump_complete.  */
 +/* A global function that get the vaule of gcov_dump_complete.  */

  int
  get_gcov_dump_complete (void)
 @@ -319,12 +319,6 @@ gcov_compute_histogram (struct gcov_summ

  /* summary for program.  */
  static struct gcov_summary this_prg;
 -#if !GCOV_LOCKED
 -/* summary for all instances of program.  */
 -static struct gcov_summary all_prg;
 -#endif
 -/* crc32 for this program.  */
 -static gcov_unsigned_t crc32;
  /* gcda filename.  */
  static char *gi_filename;
  /* buffer for the fn_data from another program.  */
 @@ -333,10 +327,9 @@ static struct gcov_fn_buffer *fn_buffer;
  static struct gcov_summary_buffer *sum_buffer;

  /* This funtions computes the program level summary and the histo-gram.
 -   It initializes ALL_PRG, computes CRC32, and stores the summary in
 -   THIS_PRG. All these three variables are file statics.  */
 +   It computes and returns CRC32  and stored summari in THIS_PRG.  */

 -static void
 +static gcov_unsigned_t
  gcov_exit_compute_summary (void)
  {
struct gcov_info *gi_ptr;
 @@ -346,10 +339,8 @@ gcov_exit_compute_summary (void)
int f_ix;
unsigned t_ix;
gcov_unsigned_t c_num;
 +  gcov_unsigned_t crc32 = 0;

 -#if !GCOV_LOCKED
 -  memset (all_prg, 0, sizeof (all_prg));
 -#endif
/* Find the totals for this execution.  */
memset (this_prg, 0, sizeof (this_prg));
for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr-next)
 @@ -391,6 +382,7 @@ gcov_exit_compute_summary (void)
  }
  }
gcov_compute_histogram (this_prg);
 +  return crc32;
  }

  /* A struct that bundles all the related information about the
 @@ -412,7 +404,8 @@ static int
  gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
struct gcov_summary *prg_p,
gcov_position_t *summary_pos_p,
 -  gcov_position_t *eof_pos_p)
 +  gcov_position_t *eof_pos_p,
 + gcov_unsigned_t crc32)
  {
gcov_unsigned_t tag, length;
unsigned t_ix;
 @@ -652,13 +645,19 @@ gcov_exit_write_gcda (const struct gcov_
 Return -1 on error. Return 0 on success.  */

  static int
 -gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary 
 *prg)
 +gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary 
 *prg,
 +gcov_unsigned_t crc32, struct gcov_summary *all_prg)
  {
struct gcov_ctr_summary *cs_prg, *cs_tprg;
 -#if !GCOV_LOCKED
 -  struct gcov_ctr_summary *cs_all;
 -#endif
unsigned t_ix;
 +  /* If application calls fork or exec multiple times, we end up storing
 + profile repeadely.  We should not account this as multiple runs or
 + functions executed once may mistakely become cold.  */
 +  static int run_accounted = 0;
 +#if !GCOV_LOCKED
 +  /* summary for all instances of program.  */
 +  struct gcov_ctr_summary *cs_all;
 +#endif

/* Merge the summaries.  */
for (t_ix = 0; t_ix  GCOV_COUNTERS_SUMMABLE; t_ix++)
 @@ -668,13 +667,18 @@ gcov_exit_merge_summary (const struct gc

if (gi_ptr-merge[t_ix])
  {
 -   

Re: atomic update of profile counters (issue7000044)

2013-11-19 Thread Rong Xu
Hi all,

I merged this old patch with current trunk. I also make the following changes
(1) not using weak references. Now every *profile_atomic() has it's
own .o so that none of them will be in the final binary if
-fprofile-generate-atomic is not specified.
(2) more value profilers have the atomic version.
(3) not link to libatomic. I used to link the libatomic in the
presence of -fprofile-generate-atomic, per Andrew's suggestion. It
used to work. But now if I can add -latomic in the SPEC, it cannot
find the libatomic.so.1 (unless I specify the PATH). I did not find an
easy way to statically link libatomic.a. Andrew: Do you have any
suggestion? Or should we let the user link to libatomic.a if the
builtins are not expanded?

Is this OK for trunk?

Thanks,

-Rong

On Mon, Jan 7, 2013 at 12:55 PM, Rong Xu x...@google.com wrote:
 Function __gcov_indirect_call_profiler_atomic (which contains call to
 the atomic function) is always emitted in libgcov.
 Since we only link libatomic when -fprofile-gen-atomic is specified,
 we have to make the atomic function weak -- otherwise, there is a
 unsat for regular FDO gen build (of course, when the builtin is not
 expanded).

 An alternative it to always link libatomic together with libgcov. Then
 we don't need the weak stuff. I'm not sure when one is better.

 -Rong

 On Mon, Jan 7, 2013 at 12:36 PM, Richard Henderson r...@redhat.com wrote:
 On 01/03/2013 04:42 PM, Rong Xu wrote:
 It links libatomic when -fprofile-gen-atomic is specified for FDO
 instrumentation build. Here I assume libatomic is always installed.
 Andrew: do you think if this is reasonable?

 It also disables the functionality if target does not support weak
 (ie. TARGET_SUPPORTS_WEAK == 0).

 Since you're linking libatomic, you don't need weak references.

 I think its ok to assume libatomic is installed, given that the
 user has had to explicitly use the command-line option.


 r~
2013-11-19  Rong Xu  x...@google.com

* gcc/gcov-io.h: Add atomic function macros for compiler use.
* gcc/common.opt (fprofile-generate-atomic): New option.
* gcc/tree-profile.c (gimple_init_edge_profiler): Support for
atomic counter update.
(gimple_gen_edge_profiler): Ditto.
* libgcc/libgcov-profiler.c 
(__gcov_interval_profiler_atomic): Ditto.
(__gcov_pow2_profiler_atomic): Ditto.
(__gcov_one_value_profiler_body_atomic): Ditto.
(__gcov_one_value_profiler_atomic): Ditto.
(__gcov_indirect_call_profiler_atomic): Ditto.
(__gcov_indirect_call_profiler_v2_atomic): Ditto.
(__gcov_time_profiler_atomic): Ditto.
(__gcov_average_profiler_atomic): Ditto.
* gcc/gcc.c: Link libatomic when -fprofile-generate-atomic used.
* libgcc/Makefile.in: Add atomic objects.

Index: gcc/common.opt
===
--- gcc/common.opt  (revision 205053)
+++ gcc/common.opt  (working copy)
@@ -1684,6 +1684,15 @@ fprofile-correction
 Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input
 
+; fprofile-generate-atomic=0: default and disable atomical update.
+; fprofile-generate-atomic=1: atomically update edge profile counters.
+; fprofile-generate-atomic=2: atomically update value profile counters.
+; fprofile-generate-atomic=3: atomically update edge and value profile 
counters.
+; other values will be ignored (fall back to the default of 0).
+fprofile-generate-atomic=
+Common Joined UInteger Report Var(flag_profile_generate_atomic) Init(3) 
Optimization
+fprofile-generate-atomic=[0..3] Atomical increments of profile counters.
+
 fprofile-generate
 Common
 Enable common options for generating profile info for profile feedback 
directed optimizations
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c  (revision 205053)
+++ gcc/tree-profile.c  (working copy)
@@ -155,6 +155,9 @@ gimple_init_edge_profiler (void)
   tree ic_profiler_fn_type;
   tree average_profiler_fn_type;
   tree time_profiler_fn_type;
+  const char *fn_name;
+  bool profile_gen_value_atomic = (flag_profile_generate_atomic == 2 ||
+   flag_profile_generate_atomic == 3);
 
   if (!gcov_type_node)
 {
@@ -167,9 +170,10 @@ gimple_init_edge_profiler (void)
  gcov_type_ptr, gcov_type_node,
  integer_type_node,
  unsigned_type_node, NULL_TREE);
+  fn_name = profile_gen_value_atomic ? __gcov_interval_profiler_atomic : 
+ __gcov_interval_profiler;
   tree_interval_profiler_fn
- = build_fn_decl (__gcov_interval_profiler,
-interval_profiler_fn_type);
+ = build_fn_decl (fn_name, interval_profiler_fn_type);
   TREE_NOTHROW (tree_interval_profiler_fn) = 1

Re: [PATCH] Time profiler - phase 1

2013-11-12 Thread Rong Xu
A question about the newly added global variable function_counter.
Does it have to be a global? Can I make it a file static, within macro
L_gcov_time_profiler? I don't find a use other than in
__gcov_time_profiler().

thanks,

-Rong



On Mon, Nov 11, 2013 at 9:52 AM, Martin Liška marxin.li...@gmail.com wrote:
 On 11 November 2013 15:57, Jan Hubicka hubi...@ucw.cz wrote:
 +2013-10-29  Martin Liska  marxin.li...@gmail.com
 + Jan Hubicka  j...@suse.cz
 +
 + * cgraph.c (dump_cgraph_node): Profile dump added.
 + * cgraph.h (struct cgraph_node): New time profile variable added.
 + * cgraphclones.c (cgraph_clone_node): Time profile is cloned.
 + * gcov-io.h (gcov_type): New profiler type introduced.
 + * ipa-profile.c (lto_output_node): Streaming for time profile added.
 + (input_node): Time profiler is read from LTO stream.
 + * predict.c (maybe_hot_count_p): Hot prediction changed.
 + * profile.c (instrument_values): New case for time profiler added.
 + (compute_value_histograms): Read of time profile.
 + * tree-pretty-print.c (dump_function_header): Time profiler is dumped.
 + * tree-profile.c (init_ic_make_global_vars): Time profiler function 
 added.
 + (gimple_init_edge_profiler): TP function instrumentation.
 + (gimple_gen_time_profiler): New.
 + * value-prof.c (gimple_add_histogram_value): Support for time profiler
 + added.
 + (dump_histogram_value): TP type added to dumps.
 + (visit_hist): More sensitive check that takes TP into account.
 + (gimple_find_values_to_profile): TP instrumentation.
 + * value-prof.h (hist_type): New histogram type added.
 + (struct histogram_value_t): Pointer to struct function added.
 + * libgcc/Makefile.in: New GCOV merge function for TP added.
 + * libgcov.c: function_counter variable introduced.
 + (_gcov_merge_time_profile): New.
 + (_gcov_time_profiler): New.
 +
  2013-11-05  Steven Bosscher  ste...@gcc.gnu.org


 diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
 index 1260069..eff4b51 100644
 --- a/gcc/ipa-profile.c
 +++ b/gcc/ipa-profile.c
 @@ -465,6 +465,7 @@ ipa_propagate_frequency (struct cgraph_node *node)
if (d.maybe_unlikely_executed)
  {
node-frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
 +  node-tp_first_run = 0;
if (dump_file)
   fprintf (dump_file, Node %s promoted to unlikely executed.\n,
cgraph_node_name (node));

 Why do you put tp_first_run to 0?  The function may be unlikely but it still 
 may have
 average possition in the program execution. I.e. when it is executed once 
 per many
 invocations during the train run)

 That makes no sense, I've removed this assignment.

 @@ -895,9 +907,22 @@ compute_value_histograms (histogram_values values, 
 unsigned cfg_checksum,
hist-hvalue.counters =  XNEWVEC (gcov_type, hist-n_counters);
for (j = 0; j  hist-n_counters; j++)
  if (aact_count)
 -   hist-hvalue.counters[j] = aact_count[j];
 - else
 -   hist-hvalue.counters[j] = 0;
 +  hist-hvalue.counters[j] = aact_count[j];
 +else
 +  hist-hvalue.counters[j] = 0;
 +
 +  /* Time profiler counter is not related to any statement,
 +   * so that we have to read the counter and set the value to
 +   * the corresponding call graph node.  */

 Reformat comment to GNU style.

 Yes.

 +2013-10-28  Martin Liska marxin.li...@gmail.com
 +
 + * gcc.dg/time-profiler-1.c: New test.
 + * gcc.dg/time-profiler-2.c: Ditto.
 +
 It is custom to put all changelogs to the beggining of your patch (just 
 nitpicking ;))
 @@ -222,6 +225,18 @@ gimple_init_edge_profiler (void)
   = tree_cons (get_identifier (leaf), NULL,
DECL_ATTRIBUTES (tree_indirect_call_profiler_fn));

 +  /* void (*) (gcov_type *, gcov_type, void *)  */
 +  time_profiler_fn_type
 += build_function_type_list (void_type_node,
 +   gcov_type_ptr, NULL_TREE);
 +  tree_time_profiler_fn
 +   = build_fn_decl (__gcov_time_profiler,
 +  time_profiler_fn_type);
 +  TREE_NOTHROW (tree_time_profiler_fn) = 1;
 +  DECL_ATTRIBUTES (tree_time_profiler_fn)
 + = tree_cons (get_identifier (leaf), NULL,
 +  DECL_ATTRIBUTES (tree_time_profiler_fn));

 We should udpate this code to use set_call_expr_flags but by incremental 
 patch.

 OK, I will include this change to the phase 2.

 The patch is OK with changes above.  Do you have write permissin to commit 
 it?
 If not, just send updated version and I will commit it + follow the write 
 access
 instructions on gcc.gnu.org homepage.

 Yes, I do have commit right. I will bootstrap the patch, test Inkscape
 instrumentation and commit it.

 Martin

 Honza


Re: [PATCH] Time profiler - phase 1

2013-11-12 Thread Rong Xu
Thanks. I'll have this change included in my libgcov.c re-re-factoring patch.

-Rong

On Tue, Nov 12, 2013 at 10:18 AM, Martin Liška marxin.li...@gmail.com wrote:
 Hi Rong,
   you are right, that's the only usage and so that you can declare it
 static.

 Thank you,
 Martin

 On Nov 12, 2013 7:10 PM, Rong Xu x...@google.com wrote:

 A question about the newly added global variable function_counter.
 Does it have to be a global? Can I make it a file static, within macro
 L_gcov_time_profiler? I don't find a use other than in
 __gcov_time_profiler().

 thanks,

 -Rong



 On Mon, Nov 11, 2013 at 9:52 AM, Martin Liška marxin.li...@gmail.com
 wrote:
  On 11 November 2013 15:57, Jan Hubicka hubi...@ucw.cz wrote:
  +2013-10-29  Martin Liska  marxin.li...@gmail.com
  + Jan Hubicka
  j...@suse.cz
  +
  + * cgraph.c (dump_cgraph_node): Profile dump added.
  + * cgraph.h (struct cgraph_node): New time profile variable
  added.
  + * cgraphclones.c (cgraph_clone_node): Time profile is cloned.
  + * gcov-io.h (gcov_type): New profiler type introduced.
  + * ipa-profile.c (lto_output_node): Streaming for time profile
  added.
  + (input_node): Time profiler is read from LTO stream.
  + * predict.c (maybe_hot_count_p): Hot prediction changed.
  + * profile.c (instrument_values): New case for time profiler
  added.
  + (compute_value_histograms): Read of time profile.
  + * tree-pretty-print.c (dump_function_header): Time profiler is
  dumped.
  + * tree-profile.c (init_ic_make_global_vars): Time profiler
  function added.
  + (gimple_init_edge_profiler): TP function instrumentation.
  + (gimple_gen_time_profiler): New.
  + * value-prof.c (gimple_add_histogram_value): Support for time
  profiler
  + added.
  + (dump_histogram_value): TP type added to dumps.
  + (visit_hist): More sensitive check that takes TP into account.
  + (gimple_find_values_to_profile): TP instrumentation.
  + * value-prof.h (hist_type): New histogram type added.
  + (struct histogram_value_t): Pointer to struct function added.
  + * libgcc/Makefile.in: New GCOV merge function for TP added.
  + * libgcov.c: function_counter variable introduced.
  + (_gcov_merge_time_profile): New.
  + (_gcov_time_profiler): New.
  +
   2013-11-05  Steven Bosscher  ste...@gcc.gnu.org
 
 
  diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
  index 1260069..eff4b51 100644
  --- a/gcc/ipa-profile.c
  +++ b/gcc/ipa-profile.c
  @@ -465,6 +465,7 @@ ipa_propagate_frequency (struct cgraph_node *node)
 if (d.maybe_unlikely_executed)
   {
 node-frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
  +  node-tp_first_run = 0;
 if (dump_file)
fprintf (dump_file, Node %s promoted to unlikely executed.\n,
 cgraph_node_name (node));
 
  Why do you put tp_first_run to 0?  The function may be unlikely but it
  still may have
  average possition in the program execution. I.e. when it is executed
  once per many
  invocations during the train run)
 
  That makes no sense, I've removed this assignment.
 
  @@ -895,9 +907,22 @@ compute_value_histograms (histogram_values
  values, unsigned cfg_checksum,
 hist-hvalue.counters =  XNEWVEC (gcov_type, hist-n_counters);
 for (j = 0; j  hist-n_counters; j++)
   if (aact_count)
  -   hist-hvalue.counters[j] = aact_count[j];
  - else
  -   hist-hvalue.counters[j] = 0;
  +  hist-hvalue.counters[j] = aact_count[j];
  +else
  +  hist-hvalue.counters[j] = 0;
  +
  +  /* Time profiler counter is not related to any statement,
  +   * so that we have to read the counter and set the value to
  +   * the corresponding call graph node.  */
 
  Reformat comment to GNU style.
 
  Yes.
 
  +2013-10-28  Martin Liska marxin.li...@gmail.com
  +
  + * gcc.dg/time-profiler-1.c: New test.
  + * gcc.dg/time-profiler-2.c: Ditto.
  +
  It is custom to put all changelogs to the beggining of your patch (just
  nitpicking ;))
  @@ -222,6 +225,18 @@ gimple_init_edge_profiler (void)
= tree_cons (get_identifier (leaf), NULL,
 DECL_ATTRIBUTES (tree_indirect_call_profiler_fn));
 
  +  /* void (*) (gcov_type *, gcov_type, void *)  */
  +  time_profiler_fn_type
  += build_function_type_list (void_type_node,
  +   gcov_type_ptr, NULL_TREE);
  +  tree_time_profiler_fn
  +   = build_fn_decl (__gcov_time_profiler,
  +  time_profiler_fn_type);
  +  TREE_NOTHROW (tree_time_profiler_fn) = 1;
  +  DECL_ATTRIBUTES (tree_time_profiler_fn)
  + = tree_cons (get_identifier (leaf), NULL,
  +  DECL_ATTRIBUTES (tree_time_profiler_fn));
 
  We should udpate this code to use set_call_expr_flags but by
  incremental patch.
 
  OK, I will include this change to the phase 2

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-12 Thread Rong Xu
The patch was checked in as r204730.

Tested with
configure with --enable-languages=all,obj-c++
(ada is currently broken, so I was not be able test)
bootstrap and profiledbootstrap
regression for -m32 and -m64.
spec2006

The only semantic change is __gcov_flush() used to be under L_gcov.
I put the function to libgcov-interface.c and made it under
L_gcov_flush (newly added).
So if you need this function, you have to enable L_gcov_flush in the
libgcc/Makefile.in.

Thanks,

-Rong

On Mon, Nov 11, 2013 at 10:19 AM, Rong Xu x...@google.com wrote:
 On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka hubi...@ucw.cz wrote:
 2013-11-04  Rong Xu  x...@google.com

   * libgcc/libgcov.c: Delete as part of re-factoring.
   * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from
   libgcov.c
   (__gcov_pow2_profiler): Ditto.
   (__gcov_one_value_profiler_body): Ditto.
   (__gcov_one_value_profiler): Ditto.
   (__gcov_indirect_call_profiler): Ditto.
   (__gcov_indirect_call_profiler_v2): Ditto.
   (__gcov_average_profiler): Ditto.
   (__gcov_ior_profiler): Ditto.
   * libgcc/libgcov-driver.c (this_prg): Make it file scope
   static variable.
   (all_prg): Ditto.
   (crc32): Ditto.
   (gi_filename): Ditto.
   (fn_buffer): Ditto.
   (sum_buffer): Ditto.
 (struct gcov_filename_aux): New types to store auxiliary information
 for gi_filename.
   (gcov_version): Moved from libgcov.c.
   (crc32_unsigned): Ditto.
   (gcov_histogram_insert): Ditto.
   (gcov_compute_histogram): Ditto.
   (gcov_exit): Ditto.
   (gcov_clear): Ditto.
   (__gcov_init): Ditto.
   (gcov_exit_compute_summary): New function split from gcov_exit().
   (gcov_exit_merge_gcda): Ditto.
   (gcov_exit_write_gcda): Ditto.
   (gcov_exit_dump_gcov): Ditto.
   * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c.
   (init_mx_once): Ditto.
   (__gcov_flush): Ditto.
   (__gcov_reset): Ditto.
   (__gcov_dump): Ditto.
   (__gcov_fork): Ditto.
   (__gcov_execl): Ditto.
   (__gcov_execlp): Ditto.
   (__gcov_execle): Ditto.
   (__gcov_execv): Ditto.
   (__gcov_execvp): Ditto.
   (__gcov_execve): Ditto.
   * libgcc/libgcov-driver-system.c (gcov_error): New utility function.
   (allocate_filename_struct): New function split from gcov_exit().
   (gcov_exit_open_gcda_file): Ditto.
 (create_file_directory): Moved from libgcov.c.
   * libgcc/libgcov-merge.c:
   (__gcov_merge_add): Moved from libgcov.c.
   (__gcov_merge_ior): Ditto.
   (__gcov_merge_single): Ditto.
   (__gcov_merge_delta): Ditto.
   * libgcc/Makefile.in: Change to build newly added files.
   * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning.


 Hi,
 the patch looks resonable. I take your word on the fact that there are no 
 changes
 in the code, I did not read it all ;)))

 We did split gcov_exit() into more functions.
 But, semantically the code is the same.

 Index: libgcc/Makefile.in
 ===
 --- libgcc/Makefile.in(revision 204285)
 +++ libgcc/Makefile.in(working copy)
 @@ -853,17 +853,37 @@ include $(iterator)
  # Build libgcov components.

  # Defined in libgcov.c, included only in gcov library
 -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
 -_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
 -_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
 -_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
 +##LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
 +##_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
 +##_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
 +##_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler 
 \
 +##_gcov_indirect_call_profiler _gcov_average_profiler 
 _gcov_ior_profiler \
 +##_gcov_merge_ior _gcov_indirect_call_profiler_v2

 Probably no need for this commnet block.

 Yes. I ready remove in the most recently patch.

 +
 +LIBGCOV_MERGE = _gcov_merge_add _gcov_merge_single _gcov_merge_delta 
 _gcov_merge_ior
 +LIBGCOV_PROFILER = _gcov_interval_profiler _gcov_pow2_profiler 
 _gcov_one_value_profiler \
  _gcov_indirect_call_profiler _gcov_average_profiler _gcov_ior_profiler 
 \
 -_gcov_merge_ior _gcov_indirect_call_profiler_v2
 +_gcov_indirect_call_profiler_v2
 +LIBGCOV_INTERFACE = _gcov_flush _gcov_fork _gcov_execl _gcov_execlp 
 _gcov_execle \
 +_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump
 +LIBGCOV_DRIVER = _gcov

 -libgcov-objects = $(patsubst %,%$(objext),$(LIBGCOV))
 +libgcov-merge-objects = $(patsubst %,%$(objext),$(LIBGCOV_MERGE))
 +libgcov-profiler-objects = $(patsubst %,%$(objext),$(LIBGCOV_PROFILER))
 +libgcov-interface-objects = $(patsubst %,%$(objext),$(LIBGCOV_INTERFACE))
 +libgcov

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-11 Thread Rong Xu
On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka hubi...@ucw.cz wrote:
 2013-11-04  Rong Xu  x...@google.com

   * libgcc/libgcov.c: Delete as part of re-factoring.
   * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from
   libgcov.c
   (__gcov_pow2_profiler): Ditto.
   (__gcov_one_value_profiler_body): Ditto.
   (__gcov_one_value_profiler): Ditto.
   (__gcov_indirect_call_profiler): Ditto.
   (__gcov_indirect_call_profiler_v2): Ditto.
   (__gcov_average_profiler): Ditto.
   (__gcov_ior_profiler): Ditto.
   * libgcc/libgcov-driver.c (this_prg): Make it file scope
   static variable.
   (all_prg): Ditto.
   (crc32): Ditto.
   (gi_filename): Ditto.
   (fn_buffer): Ditto.
   (sum_buffer): Ditto.
 (struct gcov_filename_aux): New types to store auxiliary information
 for gi_filename.
   (gcov_version): Moved from libgcov.c.
   (crc32_unsigned): Ditto.
   (gcov_histogram_insert): Ditto.
   (gcov_compute_histogram): Ditto.
   (gcov_exit): Ditto.
   (gcov_clear): Ditto.
   (__gcov_init): Ditto.
   (gcov_exit_compute_summary): New function split from gcov_exit().
   (gcov_exit_merge_gcda): Ditto.
   (gcov_exit_write_gcda): Ditto.
   (gcov_exit_dump_gcov): Ditto.
   * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c.
   (init_mx_once): Ditto.
   (__gcov_flush): Ditto.
   (__gcov_reset): Ditto.
   (__gcov_dump): Ditto.
   (__gcov_fork): Ditto.
   (__gcov_execl): Ditto.
   (__gcov_execlp): Ditto.
   (__gcov_execle): Ditto.
   (__gcov_execv): Ditto.
   (__gcov_execvp): Ditto.
   (__gcov_execve): Ditto.
   * libgcc/libgcov-driver-system.c (gcov_error): New utility function.
   (allocate_filename_struct): New function split from gcov_exit().
   (gcov_exit_open_gcda_file): Ditto.
 (create_file_directory): Moved from libgcov.c.
   * libgcc/libgcov-merge.c:
   (__gcov_merge_add): Moved from libgcov.c.
   (__gcov_merge_ior): Ditto.
   (__gcov_merge_single): Ditto.
   (__gcov_merge_delta): Ditto.
   * libgcc/Makefile.in: Change to build newly added files.
   * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning.


 Hi,
 the patch looks resonable. I take your word on the fact that there are no 
 changes
 in the code, I did not read it all ;)))

We did split gcov_exit() into more functions.
But, semantically the code is the same.

 Index: libgcc/Makefile.in
 ===
 --- libgcc/Makefile.in(revision 204285)
 +++ libgcc/Makefile.in(working copy)
 @@ -853,17 +853,37 @@ include $(iterator)
  # Build libgcov components.

  # Defined in libgcov.c, included only in gcov library
 -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
 -_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
 -_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
 -_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
 +##LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
 +##_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
 +##_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
 +##_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
 +##_gcov_indirect_call_profiler _gcov_average_profiler 
 _gcov_ior_profiler \
 +##_gcov_merge_ior _gcov_indirect_call_profiler_v2

 Probably no need for this commnet block.

Yes. I ready remove in the most recently patch.

 +
 +LIBGCOV_MERGE = _gcov_merge_add _gcov_merge_single _gcov_merge_delta 
 _gcov_merge_ior
 +LIBGCOV_PROFILER = _gcov_interval_profiler _gcov_pow2_profiler 
 _gcov_one_value_profiler \
  _gcov_indirect_call_profiler _gcov_average_profiler _gcov_ior_profiler \
 -_gcov_merge_ior _gcov_indirect_call_profiler_v2
 +_gcov_indirect_call_profiler_v2
 +LIBGCOV_INTERFACE = _gcov_flush _gcov_fork _gcov_execl _gcov_execlp 
 _gcov_execle \
 +_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump
 +LIBGCOV_DRIVER = _gcov

 -libgcov-objects = $(patsubst %,%$(objext),$(LIBGCOV))
 +libgcov-merge-objects = $(patsubst %,%$(objext),$(LIBGCOV_MERGE))
 +libgcov-profiler-objects = $(patsubst %,%$(objext),$(LIBGCOV_PROFILER))
 +libgcov-interface-objects = $(patsubst %,%$(objext),$(LIBGCOV_INTERFACE))
 +libgcov-driver-objects = $(patsubst %,%$(objext),$(LIBGCOV_DRIVER))
 +libgcov-objects = $(libgcov-merge-objects) $(libgcov-profiler-objects) \
 + $(libgcov-interface-objects) $(libgcov-driver-objects)

 -$(libgcov-objects): %$(objext): $(srcdir)/libgcov.c
 - $(gcc_compile) -DL$* -c $(srcdir)/libgcov.c
 +$(libgcov-merge-objects): %$(objext): $(srcdir)/libgcov-merge.c
 + $(gcc_compile) -DL$* -c $(srcdir)/libgcov-merge.c
 +$(libgcov-profiler-objects): %$(objext): $(srcdir)/libgcov-profiler.c
 + $(gcc_compile) -DL$* -c $(srcdir)/libgcov

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-11 Thread Rong Xu
On Mon, Nov 11, 2013 at 7:17 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Here is the patch that includes profile-tool.
 Profile-tool now has two functions: merge and rewrite. I'll add diff later.

 Compiler is tested with spec2006 and profiledbootstrap.
 profile-tool is tested with spec2006 profiles.

 Hi,
 it would be nice if you could elaborate bit more on the tool and what it does.
 I plan to implement relink only instrumentatino for LTO and for that I will 
 need
 profile reading/handling that is independent on functio nbodies, so perhaps
 we can unify the infrastructure.  My plans always was to merge the gcov and
 profile.c profile handling logic and make it bit more generic.

This tool is designed to manipulate raw profile counters. We plan to
have the following functionality:

(1) merge the gcda file with weights. We have fairly complicated build
system. Sometime online merging in
current libgcov.a not feasible (or not desirable). We need offline
tool the merge the FDO profiles.
The program may have multiple training inputs with different
importance -- this is where weighted merging
comes from.

(2) diff the profile. We want to know how different of two profile
data. Some tolerance of mismatch is required here.
This is crucial for performance triaging.

(3) rewrite the gcda file. We can do some (user-direct) fix-up work in
the rewrite. This is may not be important for FDO. But
this will be very useful for LIPO (like fix-up the indirect-target due
to mismatch, rewrite module-grouping)

(4) better dumping: like dump top-n-hottest
functions/objects/indirectly-targets etc.

This tool reconstructs the gcov_info list from gcda files and then
uses significant part of libgcov.c code:
merge, compute summary/histo-gram, and dump gcda files etc.

The merging into current libgcov can also use this piece of code to do
the in-memory merge. The advantage
here is we will have precise histro-gram.


 Also I think profile-tool is a bit generic name. I.e. it does not realy say it
 is related to GCC's coverage profiling.  Maybe gcc-profile-tool or
 gcov-profile-tool may be better?

Yes. gcov-profile-tool seems to be better, or simply gcov-tool?

 Perhaps an resonable option would also be to bundle it into gcov binary...

Bundling into gcov is an interesting idea. I do plan to read gcno file
to extract more information (at least the function name).
Bungling into gcda means the tool can work on bb level, rather on the
raw counters. Let me think a bit more on this.

 Index: libgcc/libgcov-tool.c
 ===
 --- libgcc/libgcov-tool.c (revision 0)
 +++ libgcc/libgcov-tool.c (revision 0)
 @@ -0,0 +1,800 @@
 +/* GCC instrumentation plugin for ThreadSanitizer.
 +   Copyright (C) 2011-2013 Free Software Foundation, Inc.
 +   Contributed by Dmitry Vyukov dvyu...@google.com

 You need to update this...
 Why there needs to be libgcov changes?

David pointed out the copyright issue in an earlier email. I fixed it.

The changes in libgcov are
(1) ifdef the target headers. I need all the libgcov functionality to
read/write the gcda files, as well as summary and histo-gram. But the
binary is for host machines.
(2) change the merge function so that I can take an in-memory
gcov-info as input.


 Honza


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li davi...@google.com wrote:
 in libgcov-driver.c

 /* Flag when the profile has already been dumped via __gcov_dump().  */
 static int gcov_dump_complete;

 inline void
 set_gcov_dump_complete (void)
{
gcov_dump_complete = 1;
}

inline void
reset_gcov_dump_complete (void)
{
  gcov_dump_complete = 0;
}


 These should be moved to libgcov-interface.c. Is there reason to not
 mark them as static?

gcov_dump_complete is used in gcov_exit() which is in
libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
in libgcov-interface.c.
I want gcov_dump_complete a static. So I have to add to global
functions that accessible from libgcov-interface.c.
Another choice is to move __gcov_dump and __gcov_reset to
libgcov-driver.c and that will simplify the code.


 in libgcov-profiler.c, line 142

 #if defined(HAVE_CC_TLS)  !defined (USE_EMUTLS)
 __thread

 trailing whilte space.


Will fix.


 || (VTABLE_USES_DESCRIPTORS  __gcov_indirect_call_callee
  *(void **) cur_func == *(void **) __gcov_indirect_call_callee))

 trailing white space.


Will fix.


 in libgcov-merge.c

 1) I don't think you need in_mem flag. For in memory merge, the source != 
 NULL.
 2) document the new source and weight parameter -- especially the weight.

Will do.

 3) How do you deal with integer overflow ?

This is good question. gcvo_type is (typically) long long. I thought
the count value will not be nearly close to the max of long long.
(We did see overflow in compiler, but that's because of the scaling --
some of the scaling factors are really large.)


 David






 On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu x...@google.com wrote:
 On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers jos...@codesourcery.com 
 wrote:
 On Thu, 7 Nov 2013, Rong Xu wrote:

 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?

 Its own texi file, probably included as a chapter by gcc.texi (like
 gcov.texi is).

 OK. will add this in.

 My last patch that changes the command handling is actually broken.
 Please ignore that patch and use the one in this email.

 Also did some minor adjust of the code in libgcov.

 Thanks,

 -Rong



 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
A question about inhibit_libc.
When inhibit_libc is defined, we provide dummy functions for all the
__gcov_* methods.
Is it purposely to minimize the footprint?
I'm thinking to allow some codes that are independent of libc (and its
headers) under this. Is it OK?

-Rong

On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu x...@google.com wrote:
 On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li davi...@google.com wrote:
 in libgcov-driver.c

 /* Flag when the profile has already been dumped via __gcov_dump().  */
 static int gcov_dump_complete;

 inline void
 set_gcov_dump_complete (void)
{
gcov_dump_complete = 1;
}

inline void
reset_gcov_dump_complete (void)
{
  gcov_dump_complete = 0;
}


 These should be moved to libgcov-interface.c. Is there reason to not
 mark them as static?

 gcov_dump_complete is used in gcov_exit() which is in
 libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
 in libgcov-interface.c.
 I want gcov_dump_complete a static. So I have to add to global
 functions that accessible from libgcov-interface.c.
 Another choice is to move __gcov_dump and __gcov_reset to
 libgcov-driver.c and that will simplify the code.


 in libgcov-profiler.c, line 142

 #if defined(HAVE_CC_TLS)  !defined (USE_EMUTLS)
 __thread

 trailing whilte space.


 Will fix.


 || (VTABLE_USES_DESCRIPTORS  __gcov_indirect_call_callee
  *(void **) cur_func == *(void **) __gcov_indirect_call_callee))

 trailing white space.


 Will fix.


 in libgcov-merge.c

 1) I don't think you need in_mem flag. For in memory merge, the source != 
 NULL.
 2) document the new source and weight parameter -- especially the weight.

 Will do.

 3) How do you deal with integer overflow ?

 This is good question. gcvo_type is (typically) long long. I thought
 the count value will not be nearly close to the max of long long.
 (We did see overflow in compiler, but that's because of the scaling --
 some of the scaling factors are really large.)


 David






 On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu x...@google.com wrote:
 On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers jos...@codesourcery.com 
 wrote:
 On Thu, 7 Nov 2013, Rong Xu wrote:

 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?

 Its own texi file, probably included as a chapter by gcc.texi (like
 gcov.texi is).

 OK. will add this in.

 My last patch that changes the command handling is actually broken.
 Please ignore that patch and use the one in this email.

 Also did some minor adjust of the code in libgcov.

 Thanks,

 -Rong



 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li davi...@google.com wrote:
 On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu x...@google.com wrote:
 On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li davi...@google.com 
 wrote:
 in libgcov-driver.c

 /* Flag when the profile has already been dumped via __gcov_dump().  */
 static int gcov_dump_complete;

 inline void
 set_gcov_dump_complete (void)
{
gcov_dump_complete = 1;
}

inline void
reset_gcov_dump_complete (void)
{
  gcov_dump_complete = 0;
}


 These should be moved to libgcov-interface.c. Is there reason to not
 mark them as static?

 gcov_dump_complete is used in gcov_exit() which is in
 libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
 in libgcov-interface.c.
 I want gcov_dump_complete a static. So I have to add to global
 functions that accessible from libgcov-interface.c.
 Another choice is to move __gcov_dump and __gcov_reset to
 libgcov-driver.c and that will simplify the code.


 Ok  then you should remove the 'inline' keyword for the set and reset
 functions and keep them in libgcov-driver.c.

Will remove 'inline' keyword.



 in libgcov-profiler.c, line 142

 #if defined(HAVE_CC_TLS)  !defined (USE_EMUTLS)
 __thread

 trailing whilte space.


 Will fix.


 || (VTABLE_USES_DESCRIPTORS  __gcov_indirect_call_callee
  *(void **) cur_func == *(void **) __gcov_indirect_call_callee))

 trailing white space.


 Will fix.


 in libgcov-merge.c

 1) I don't think you need in_mem flag. For in memory merge, the source != 
 NULL.
 2) document the new source and weight parameter -- especially the weight.

 Will do.

 3) How do you deal with integer overflow ?

 This is good question. gcvo_type is (typically) long long. I thought
 the count value will not be nearly close to the max of long long.
 (We did see overflow in compiler, but that's because of the scaling --
 some of the scaling factors are really large.)


 Why not passing weight as a scaled PERCENT  (as branch prob) for the
 source? THis way, the merge tool does not need to do scaling twice.


there are two weights, so unless you have two weights in the merge_add
functions, you have to
traverse the list twice. Do you mean pass addition parameters?

Currently, merge tools is done this way:
merge (p1, p2, w1, w2)
first pass: merge_add (p1, p1, w1)
second pass: merge_add (p1, p2, w2)

I initially had both weights in merge_add function. but later I found that
to deal with mis-match profile, I need to find out the gcov_info in p1
but not p2, (they need to multiply by w1 only).
So I choose the above structure.

Another thing about the PERCENT is the inaccuracy for floating points.

I have the scaling function in rewrite sub-command mainly for debug purpose:
I merge the same profile with a weight 1:9.
Then I scale the result profile by 0.1. I was expecting identical
profile as the source. But due to floating point inaccuracy, some
counters are off slightly.


 David




 David






 On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu x...@google.com wrote:
 On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers jos...@codesourcery.com 
 wrote:
 On Thu, 7 Nov 2013, Rong Xu wrote:

 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?

 Its own texi file, probably included as a chapter by gcc.texi (like
 gcov.texi is).

 OK. will add this in.

 My last patch that changes the command handling is actually broken.
 Please ignore that patch and use the one in this email.

 Also did some minor adjust of the code in libgcov.

 Thanks,

 -Rong



 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 11:30 AM, Xinliang David Li davi...@google.com wrote:
 On Fri, Nov 8, 2013 at 11:21 AM, Rong Xu x...@google.com wrote:
 On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li davi...@google.com 
 wrote:
 On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu x...@google.com wrote:
 On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li davi...@google.com 
 wrote:
 in libgcov-driver.c

 /* Flag when the profile has already been dumped via __gcov_dump().  */
 static int gcov_dump_complete;

 inline void
 set_gcov_dump_complete (void)
{
gcov_dump_complete = 1;
}

inline void
reset_gcov_dump_complete (void)
{
  gcov_dump_complete = 0;
}


 These should be moved to libgcov-interface.c. Is there reason to not
 mark them as static?

 gcov_dump_complete is used in gcov_exit() which is in
 libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
 in libgcov-interface.c.
 I want gcov_dump_complete a static. So I have to add to global
 functions that accessible from libgcov-interface.c.
 Another choice is to move __gcov_dump and __gcov_reset to
 libgcov-driver.c and that will simplify the code.


 Ok  then you should remove the 'inline' keyword for the set and reset
 functions and keep them in libgcov-driver.c.

 Will remove 'inline' keyword.



 in libgcov-profiler.c, line 142

 #if defined(HAVE_CC_TLS)  !defined (USE_EMUTLS)
 __thread

 trailing whilte space.


 Will fix.


 || (VTABLE_USES_DESCRIPTORS  __gcov_indirect_call_callee
  *(void **) cur_func == *(void **) 
 __gcov_indirect_call_callee))

 trailing white space.


 Will fix.


 in libgcov-merge.c

 1) I don't think you need in_mem flag. For in memory merge, the source != 
 NULL.
 2) document the new source and weight parameter -- especially the weight.

 Will do.

 3) How do you deal with integer overflow ?

 This is good question. gcvo_type is (typically) long long. I thought
 the count value will not be nearly close to the max of long long.
 (We did see overflow in compiler, but that's because of the scaling --
 some of the scaling factors are really large.)


 Why not passing weight as a scaled PERCENT  (as branch prob) for the
 source? THis way, the merge tool does not need to do scaling twice.


 there are two weights, so unless you have two weights in the merge_add
 functions, you have to
 traverse the list twice. Do you mean pass addition parameters?

 Currently, merge tools is done this way:
 merge (p1, p2, w1, w2)
 first pass: merge_add (p1, p1, w1)
 second pass: merge_add (p1, p2, w2)


 Only need to pass the normalized the weight (in percent) for one
 profile source -- say norm_w2 : w2/(w1+w2), and the merge function can
 do this in one pass as norm_w1 = 1-norm_w2.


 I initially had both weights in merge_add function. but later I found that
 to deal with mis-match profile, I need to find out the gcov_info in p1
 but not p2, (they need to multiply by w1 only).
 So I choose the above structure.

 Another thing about the PERCENT is the inaccuracy for floating points.

 This is how profile update work in profile-use compilation -- so that
 should not be a big problem.

 Before you fix this, I'd like to hear what Honza and other reviewers think.

 thanks,

 David


OK. Let's get more comments on this before I re-work on this.
I also want to point out that the traversal the gcov-info list is
cheap. Even with the huge profile for google internal programs. The
majority of the time of the merge-tool is in read/write gcda files..

-Rong


 I have the scaling function in rewrite sub-command mainly for debug purpose:
 I merge the same profile with a weight 1:9.
 Then I scale the result profile by 0.1. I was expecting identical
 profile as the source. But due to floating point inaccuracy, some
 counters are off slightly.


 David




 David






 On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu x...@google.com wrote:
 On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers 
 jos...@codesourcery.com wrote:
 On Thu, 7 Nov 2013, Rong Xu wrote:

 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?

 Its own texi file, probably included as a chapter by gcc.texi (like
 gcov.texi is).

 OK. will add this in.

 My last patch that changes the command handling is actually broken.
 Please ignore that patch and use the one in this email.

 Also did some minor adjust of the code in libgcov.

 Thanks,

 -Rong



 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-04 Thread Rong Xu
BTW, this part of patch only does the re-factoring work.
I haven't finished the porting of profile-tool part of patch to the
trunk. They should be in a separate patch anyway.

-Rong

On Mon, Nov 4, 2013 at 2:43 PM, Rong Xu x...@google.com wrote:
 Honza, Thanks for the comments!
 Attached is the patch.
 Passed spec2006 fdo build and profiledbootstrap in gcc.
 I'm doing some other tests.

 -Rong

 On Mon, Nov 4, 2013 at 1:55 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Thanks! Can you attach the patch file for the proposed change?

 David

 On Fri, Nov 1, 2013 at 2:08 PM, Rong Xu x...@google.com wrote:
  libgcov.c is the main file to generate libgcov.a. This single source 
  generates
  21 object files and then archives to a library. These objects are of
  very different purposes but they are in the same file guarded by various 
  macros.
  The source file becomes quite large and its readability becomes very
  low. In its current state:
  1) The code is very hard to understand by new developers;

 Agreed, libgcov has became hard to maintain since several parts are written
 in very low-level way.
  2) It makes regressions more likely when making simple changes;
  More importantly,
  3) it makes code reuse/sharing very difficult. For instance, Using
  libgcov to implement an offline profile tool (see below); kernel FDO
  support etc.

 Yep, it was my longer term plan to do this, too.
 
  We came to this issue when working on an offline tool to handle
  profile counters.
  Our plan is to have a profile-tool can merge, diff, collect statistics, 
  and
  better dump raw profile counters. This is one of the most requested 
  features
  from out internal users. This tool can also be very useful for any 
  FDO/coverage
  users. In the first part of tool, we want to have the weight
  merge. Again, to reuse the code, we have to change libgcov.c. But we 
  don't want
  to further diverge libgcov.a in our google branches from the trunk.
 
  Another issue in libgcov.c is function gcov_exit(). It is a huge function
  with about 467 lines of code. This function traverses gcov_list and dumps 
  all
  the gcda files. The code for merging the gcda files also sits in the same
   function. The structure makes the code-reuse and extension really 
  difficult
  (like in kernel FDO, we have to break this function to reuse some of the 
  code).
 
  We propose to break gcov_exit into smaller functions and split libgcov.c 
  into
  several files. Our plan for profile-tool is listed in (3).
 
  1. break gcov_exit() into the following structure:
 gcov_exit()
-- gcov_exit_compute_summary()
-- allocate_filename_struct()
for gi_ptr in gcov_list
  -- gcov_exit_dump_gcov()
 -- gcov_exit_open_gcda_file()
 -- gcov_exit_merge_gcda ()
 -- gcov_exit_write_gcda ()

 Just a short comment that summaries are also produced during the merging - 
 i.e. they are
 done on per-file basis. But otherwise I agree.
 We should be cureful to not increase footprint of libgcov much for embedded 
 users, but
 I believe changes like this can be done w/o additional overhead in the 
 optimized library.

 The merge of summary has two parts. One is to find the summary to
 merge to. That is in gcov_exit_merge_gcda().
 The other part of do the actual merging of summary is in 
 gcov_exit_dump_gcov().

 I tried to keep the current work flow. So except for a few more calls,
 I don't think there is much overhead.

 
  2. split libgcov.c into the following files:
   libgcov-profiler.c
   libgcov-merge.c
   libgcov-interface.c
   libgcov-driver.c
 libgcov-driver-system.c (source included into libgcov-driver.c)

 Seems resonable.  Splitting i/o stuff into separate module (driver) is 
 something I planned
 for long time.  What is difference in between libgcov-interface and 
 libgcov-driver?

 Basically libgcov-driver handles all the dumping of gcov-info.
 libgcov-interface is the API used by instrumentation, like the wrapper for 
 fork.


 
  The details of the splitting:
  libgcov-interface.c
  /* This file contains API functions to other programs and the supporting
 functions.  */
__gcov_dump()
__gcov_execl()
__gcov_execle()
__gcov_execlp()
__gcov_execv()
__gcov_execve()
__gcov_execvp()
__gcov_flush()
__gcov_fork()
__gcov_reset()
init_mx()
init_mx_once()
 
  libgcov-profiler.c
  /* This file contains runtime profile handlers.  */
variables:
  __gcov_indirect_call_callee
  __gcov_indirect_call_counters
functions:
  __gcov_average_profiler()
  __gcov_indirect_call_profiler()
  __gcov_indirect_call_profiler_v2()
  __gcov_interval_profiler()
  __gcov_ior_profiler()
  __gcov_one_value_profiler()
  __gcov_one_value_profiler_body()
  __gcov_pow2_profiler()
 
  libgcov-merge.c
  /* This file contains the merge functions for various counters.  */
functions

[RFC] libgcov.c re-factoring and offline profile-tool

2013-11-01 Thread Rong Xu
libgcov.c is the main file to generate libgcov.a. This single source generates
21 object files and then archives to a library. These objects are of
very different purposes but they are in the same file guarded by various macros.
The source file becomes quite large and its readability becomes very
low. In its current state:
1) The code is very hard to understand by new developers;
2) It makes regressions more likely when making simple changes;
More importantly,
3) it makes code reuse/sharing very difficult. For instance, Using
libgcov to implement an offline profile tool (see below); kernel FDO
support etc.

We came to this issue when working on an offline tool to handle
profile counters.
Our plan is to have a profile-tool can merge, diff, collect statistics, and
better dump raw profile counters. This is one of the most requested features
from out internal users. This tool can also be very useful for any FDO/coverage
users. In the first part of tool, we want to have the weight
merge. Again, to reuse the code, we have to change libgcov.c. But we don't want
to further diverge libgcov.a in our google branches from the trunk.

Another issue in libgcov.c is function gcov_exit(). It is a huge function
with about 467 lines of code. This function traverses gcov_list and dumps all
the gcda files. The code for merging the gcda files also sits in the same
 function. The structure makes the code-reuse and extension really difficult
(like in kernel FDO, we have to break this function to reuse some of the code).

We propose to break gcov_exit into smaller functions and split libgcov.c into
several files. Our plan for profile-tool is listed in (3).

1. break gcov_exit() into the following structure:
   gcov_exit()
  -- gcov_exit_compute_summary()
  -- allocate_filename_struct()
  for gi_ptr in gcov_list
-- gcov_exit_dump_gcov()
   -- gcov_exit_open_gcda_file()
   -- gcov_exit_merge_gcda ()
   -- gcov_exit_write_gcda ()

2. split libgcov.c into the following files:
 libgcov-profiler.c
 libgcov-merge.c
 libgcov-interface.c
 libgcov-driver.c
   libgcov-driver-system.c (source included into libgcov-driver.c)

The details of the splitting:
libgcov-interface.c
/* This file contains API functions to other programs and the supporting
   functions.  */
  __gcov_dump()
  __gcov_execl()
  __gcov_execle()
  __gcov_execlp()
  __gcov_execv()
  __gcov_execve()
  __gcov_execvp()
  __gcov_flush()
  __gcov_fork()
  __gcov_reset()
  init_mx()
  init_mx_once()

libgcov-profiler.c
/* This file contains runtime profile handlers.  */
  variables:
__gcov_indirect_call_callee
__gcov_indirect_call_counters
  functions:
__gcov_average_profiler()
__gcov_indirect_call_profiler()
__gcov_indirect_call_profiler_v2()
__gcov_interval_profiler()
__gcov_ior_profiler()
__gcov_one_value_profiler()
__gcov_one_value_profiler_body()
__gcov_pow2_profiler()

libgcov-merge.c
/* This file contains the merge functions for various counters.  */
  functions:
__gcov_merge_add()
__gcov_merge_delta()
__gcov_merge_ior()
__gcov_merge_single()

libcov-driver.c
/* This file contains the gcov dumping functions. We separate the
   system dependent part to libgcov-driver-system.c.  */
  variables:
gcov_list
gcov_max_filename
gcov_dump_complete
-- newly added file static variables --
this_prg
all_prg
crc32
gi_filename
fn_buffer
fn_tail
sum_buffer
next_sum_buffer
sum_tail
- end -
  functions:
free_fn_data()
buffer_fn_data()
crc32_unsigned()
gcov_version()
gcov_histogram_insert()
gcov_compute_histogram()
gcov_clear()
__gcov_init()
gcov_exit()
--- newly added static functions --
gcov_exit_compute_summary ()
gcov_exit_merge_gcda()
gcov_exit_write_gcda()
gcov_exit_dump_gcov()
- end -

libgcov-driver-system.c
/* system dependent part of ligcov-driver.c.  */
  functions:
create_file_directory()
--- newly added static functions --
gcov_error() /* This replaces all fprintf(stderr, ...) */
allocate_filename_struct()
gcov_exit_open_gcda_file()

3. add offline profile-tool support.
   We will change the interface of merge functions to make it
   take in-memory gcov_info list, and a weight as the arguments.

   We will add libgcov-tool.c. It has two APIs:
   (1) read_profile_dir(): read a profile directory and reconstruct the
   gcov_info link list in-memory.
   (2) merge_profiles(): merge two in-memory gcov_info link list.

   We also add profile-tool.c in gcc directory. It will source-include
   libgcov-tool.c and build a host binary. (We don't do library style
   because that will need a hard dependence from gcc to libgcc).
   profile-tool.c will mainly handle user options and the write-out of
   gcov-info link list. Some changes in gcov-io.[ch] will be also needed.

Thanks,

[google gcc-4_8] write ggc_memory to gcda file

2013-10-25 Thread Rong Xu
Hi,

This patch writes out ggc_memory to gcda files.

Google branches only. Test is ongoing. OK to check-in after test passes?

-Rong
2013-10-25  Rong Xu  x...@google.com

* contrib/profile_tool (ModuleInfo): write out ggc_memory to gcda file.
* gcc/gcov-io.c (gcov_read_module_info): Ditto.
* libgcc/dyn-ipa.c (gcov_write_module_info): Ditto.

Index: contrib/profile_tool
===
--- contrib/profile_tool(revision 204080)
+++ contrib/profile_tool(working copy)
@@ -674,6 +674,7 @@ class ModuleInfo(DataObject):
 self.module_id = reader.ReadWord()
 self.is_primary = reader.ReadWord()
 self.flags = reader.ReadWord()
+self.ggc_memory = reader.ReadWord()
 self.language = reader.ReadWord()
 self.num_quote_paths = reader.ReadWord()
 self.num_bracket_paths = reader.ReadWord()
@@ -710,6 +711,7 @@ class ModuleInfo(DataObject):
 writer.WriteWord(self.is_primary)
 writer.WriteWord(self.flags)
 writer.WriteWord(self.language)
+writer.WriteWord(self.ggc_memory)
 writer.WriteWord(self.num_quote_paths)
 writer.WriteWord(self.num_bracket_paths)
 writer.WriteWord(self.num_system_paths)
Index: gcc/gcov-io.c
===
--- gcc/gcov-io.c   (revision 204080)
+++ gcc/gcov-io.c   (working copy)
@@ -591,14 +591,15 @@ gcov_read_module_info (struct gcov_module_info *mo
   mod_info-ident = gcov_read_unsigned ();
   mod_info-is_primary = gcov_read_unsigned ();
   mod_info-flags = gcov_read_unsigned ();
-  mod_info-lang  = gcov_read_unsigned ();
+  mod_info-lang = gcov_read_unsigned ();
+  mod_info-ggc_memory = gcov_read_unsigned ();
   mod_info-num_quote_paths = gcov_read_unsigned ();
   mod_info-num_bracket_paths = gcov_read_unsigned ();
   mod_info-num_system_paths = gcov_read_unsigned ();
   mod_info-num_cpp_defines = gcov_read_unsigned ();
   mod_info-num_cpp_includes = gcov_read_unsigned ();
   mod_info-num_cl_args = gcov_read_unsigned ();
-  len -= 10;
+  len -= 11;
 
   filename_len = gcov_read_unsigned ();
   mod_info-da_filename = (char *) xmalloc (filename_len *
Index: libgcc/dyn-ipa.c
===
--- libgcc/dyn-ipa.c(revision 204080)
+++ libgcc/dyn-ipa.c(working copy)
@@ -2095,7 +2095,7 @@ gcov_write_module_info (const struct gcov_info *mo
   len += 1; /* Each string is lead by a length.  */
 }
 
-  len += 10; /* 9 more fields */
+  len += 11; /* 11 more fields */
 
   gcov_write_tag_length (GCOV_TAG_MODULE_INFO, len);
   gcov_write_unsigned (module_info-ident);
@@ -2104,6 +2104,7 @@ gcov_write_module_info (const struct gcov_info *mo
 SET_MODULE_INCLUDE_ALL_AUX (module_info);
   gcov_write_unsigned (module_info-flags);
   gcov_write_unsigned (module_info-lang);
+  gcov_write_unsigned (module_info-ggc_memory);
   gcov_write_unsigned (module_info-num_quote_paths);
   gcov_write_unsigned (module_info-num_bracket_paths);
   gcov_write_unsigned (module_info-num_system_paths);


Re: [GOOGLE] Check if varpool node exist for decl before checking if it's from auxiliary module.

2013-10-22 Thread Rong Xu
seems fine to me for google branches.

-Rong

On Tue, Oct 22, 2013 at 10:51 AM, Dehao Chen de...@google.com wrote:
 This is fixing a LIPO bug when there -fexception is on.

 When compilation is finished, compile_file calls
 dw2_output_indirect_constants, which may generate decls like
 DW.ref.__gxx_personality_v0 (generated in
 dw2_output_indirect_constant_1). This function is a global function,
 but does not have associated varpool node. So original code will
 segfault when checking if the varpool node is from auxiliary module.

 Verified that the segfault is fixed with the patch. Regression test on-going.

 OK for google-4_8 branch if reg test is green?

 Thanks,
 Dehao

 Index: gcc/varasm.c
 ===
 --- gcc/varasm.c (revision 203910)
 +++ gcc/varasm.c (working copy)
 @@ -1484,7 +1484,7 @@ notice_global_symbol (tree decl)
if (L_IPO_COMP_MODE
 ((TREE_CODE (decl) == FUNCTION_DECL
  cgraph_is_auxiliary (decl))
 -  || (TREE_CODE (decl) == VAR_DECL
 +  || (TREE_CODE (decl) == VAR_DECL  varpool_get_node (decl)
 varpool_is_auxiliary (varpool_get_node (decl)
  return;


[google gcc-4_8] LIPO: insert static vars to varpool

2013-10-14 Thread Rong Xu
Hi,

For google gcc-4_8 branch only.

This is fixes the NULL pointer dereference in copy_tree_r due to empty
varpool_node returned.

Passed the ICE compilation. Other tests are ongoing.

Thanks,

-Rong


2013-10-14  Rong Xu  x...@google.com

* cp/semantics.c (finish_compound_literal): Put static var to
varpool. This is for LIPO only.
* cp/call.c (make_temporary_var_for_ref_to_temp): Ditto.

Index: cp/semantics.c
===
--- cp/semantics.c  (revision 203471)
+++ cp/semantics.c  (working copy)
@@ -2486,6 +2486,10 @@ finish_compound_literal (tree type, tree compound_
   decl = pushdecl_top_level (decl);
   DECL_NAME (decl) = make_anon_name ();
   SET_DECL_ASSEMBLER_NAME (decl, DECL_NAME (decl));
+  /* Capture the current module info for statics.  */
+  if (L_IPO_COMP_MODE)
+varpool_node_for_decl (decl);
+
   return decl;
 }
   else
Index: cp/call.c
===
--- cp/call.c   (revision 203471)
+++ cp/call.c   (working copy)
@@ -9047,6 +9047,9 @@ make_temporary_var_for_ref_to_temp (tree decl, tre
   tree name;

   TREE_STATIC (var) = TREE_STATIC (decl);
+  /* Capture the current module info for statics.  */
+  if (L_IPO_COMP_MODE  TREE_STATIC (var))
+varpool_node_for_decl (var);
   DECL_TLS_MODEL (var) = DECL_TLS_MODEL (decl);
   name = mangle_ref_init_variable (decl);
   DECL_NAME (var) = name;


[PATCH] increase builtin_expert probability in loop exit test

2013-10-11 Thread Rong Xu
Hi,

An earlier patch (r203167) changed the default probability for
builtin_expect to 90%.
It does not work properly for the following case:
   while (__builin_expect (expr, 1)) { }

W/o builtin_expect, the exit probability is 9% while w/
builtin_expect, the exit probability is 10%.
It seems to be wrong as we should estimate this loop has more
iterations than w/o the annotation.

This attached patch bump the probability for this particular case.

Tested with bootstrap.

Thanks,

-Rong


patch_diff
Description: Binary data


[google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
The trunk version of this patch is submitted for review.

David: can we have this patch for google/gcc-4_8 branch first?
It tested with regression and google internal benchmarks.

Thanks,

-Rong
2013-10-11  Rong Xu  x...@google.com

* predict.c (tree_predict_by_opcode): Bump the 
estimiated probability if builtin_expect expression is
in loop exit test.

Index: predict.c
===
--- predict.c   (revision 203462)
+++ predict.c   (working copy)
@@ -1951,7 +1951,31 @@ tree_predict_by_opcode (basic_block bb)
   if (val)
 {
   int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY);
+  void **preds;
 
+  /* This handles the cases like
+   while (__builtin_expect (exp, 1)) { ... }
+ W/o builtin_expect, the default HITRATE is 91%.
+It does not make sense to estimate a lower probability of 90%
+(current default for builtin_expect) with the annotation.
+So here, we bump the probability by a small amount.  */
+  preds = pointer_map_contains (bb_predictions, bb);
+  if (preds)
+{
+  struct edge_prediction *pred;
+
+  for (pred = (struct edge_prediction *) *preds; pred;
+  pred = pred-ep_next)
+   {
+ if (pred-ep_predictor == PRED_LOOP_EXIT
+  predictor_info [(int) PRED_LOOP_EXIT].hitrate
+ HITRATE (percent))
+   percent += 4;
+ if (percent  100)
+   percent = 100;
+   }
+   }
+
   gcc_assert (percent = 0  percent = 100);
   if (integer_zerop (val))
 percent = 100 - percent;


Re: [google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
I want to differentiate the cases w/o and w/ builtin.
If I take the max, they will be the same (91%).

-Rong

On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li davi...@google.com wrote:
 Why this 'percent += 4'  instead of taking the max?

 David

 On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu x...@google.com wrote:
 The trunk version of this patch is submitted for review.

 David: can we have this patch for google/gcc-4_8 branch first?
 It tested with regression and google internal benchmarks.

 Thanks,

 -Rong


Re: [google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
ok. that makes sense.


On Fri, Oct 11, 2013 at 2:41 PM, Xinliang David Li davi...@google.com wrote:
 Should it be max + some_delta then?

 David

 On Fri, Oct 11, 2013 at 2:32 PM, Rong Xu x...@google.com wrote:
 I want to differentiate the cases w/o and w/ builtin.
 If I take the max, they will be the same (91%).

 -Rong

 On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li davi...@google.com 
 wrote:
 Why this 'percent += 4'  instead of taking the max?

 David

 On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu x...@google.com wrote:
 The trunk version of this patch is submitted for review.

 David: can we have this patch for google/gcc-4_8 branch first?
 It tested with regression and google internal benchmarks.

 Thanks,

 -Rong


Re: [google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
here is the new patch. Note that the hitrate won't change by this
patch for the case of
  while (__builtin_expect (exp, 0))

Index: predict.c
===
--- predict.c   (revision 203462)
+++ predict.c   (working copy)
@@ -1951,11 +1951,42 @@ tree_predict_by_opcode (basic_block bb)
   if (val)
 {
   int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY);
+  int hitrate;

   gcc_assert (percent = 0  percent = 100);
   if (integer_zerop (val))
-percent = 100 - percent;
-  predict_edge (then_edge, PRED_BUILTIN_EXPECT, HITRATE (percent));
+hitrate = HITRATE (100 - percent);
+  else
+{
+  /* This handles the cases like
+   while (__builtin_expect (exp, 1)) { ... }
+ W/o builtin_expect, the default HITRATE is 91%.
+ It does not make sense to estimate a lower probability of 90%
+ (current default for builtin_expect) with the annotation.
+ So here, we bump the probability by a small amount.  */
+  void **preds = pointer_map_contains (bb_predictions, bb);
+
+  hitrate = HITRATE (percent);
+  if (preds)
+{
+  struct edge_prediction *pred;
+  int exit_hitrate = predictor_info [(int) PRED_LOOP_EXIT].hitrate;
+
+  for (pred = (struct edge_prediction *) *preds; pred;
+   pred = pred-ep_next)
+{
+  if (pred-ep_predictor == PRED_LOOP_EXIT
+   exit_hitrate  hitrate)
+{
+  hitrate = exit_hitrate + HITRATE (4);
+  if (hitrate  REG_BR_PROB_BASE)
+hitrate = REG_BR_PROB_BASE;
+  break;
+}
+}
+}
+}
+  predict_edge (then_edge, PRED_BUILTIN_EXPECT, hitrate);
 }
   /* Try pointer heuristic.
  A comparison ptr == 0 is predicted as false.

On Fri, Oct 11, 2013 at 2:45 PM, Rong Xu x...@google.com wrote:
 ok. that makes sense.


 On Fri, Oct 11, 2013 at 2:41 PM, Xinliang David Li davi...@google.com wrote:
 Should it be max + some_delta then?

 David

 On Fri, Oct 11, 2013 at 2:32 PM, Rong Xu x...@google.com wrote:
 I want to differentiate the cases w/o and w/ builtin.
 If I take the max, they will be the same (91%).

 -Rong

 On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li davi...@google.com 
 wrote:
 Why this 'percent += 4'  instead of taking the max?

 David

 On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu x...@google.com wrote:
 The trunk version of this patch is submitted for review.

 David: can we have this patch for google/gcc-4_8 branch first?
 It tested with regression and google internal benchmarks.

 Thanks,

 -Rong


Re: [PATCH] increase builtin_expert probability in loop exit test

2013-10-11 Thread Rong Xu
Attached is the patchset 2. It takes the max to two hitrates then does
the incremental.


On Fri, Oct 11, 2013 at 2:01 PM, Rong Xu x...@google.com wrote:
 Hi,

 An earlier patch (r203167) changed the default probability for
 builtin_expect to 90%.
 It does not work properly for the following case:
while (__builin_expect (expr, 1)) { }

 W/o builtin_expect, the exit probability is 9% while w/
 builtin_expect, the exit probability is 10%.
 It seems to be wrong as we should estimate this loop has more
 iterations than w/o the annotation.

 This attached patch bump the probability for this particular case.

 Tested with bootstrap.

 Thanks,

 -Rong
2013-10-11  Rong Xu  x...@google.com

* predict.c (tree_predict_by_opcode): Bump the 
estimiated probability if builtin_expect expression is
in loop exit test.

Index: predict.c
===
--- predict.c   (revision 203463)
+++ predict.c   (working copy)
@@ -2001,11 +2001,39 @@ tree_predict_by_opcode (basic_block bb)
   if (val)
 {
   int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY);
+  void **preds;
+  int hitrate;
 
   gcc_assert (percent = 0  percent = 100);
+  /* This handles the cases like
+   while (__builtin_expect (exp, 1)) { ... }
+ W/o builtin_expect, the default HITRATE is 91%.
+ It does not make sense to estimate a lower probability of 90%
+ (current default for builtin_expect) with the annotation.
+ So here, we bump the probability by a small amount.  */
+  preds = pointer_map_contains (bb_predictions, bb);
+  hitrate = HITRATE (percent);
+  if (preds)
+{
+  struct edge_prediction *pred;
+  int exit_hitrate = predictor_info [(int) PRED_LOOP_EXIT].hitrate;
+
+  for (pred = (struct edge_prediction *) *preds; pred;
+   pred = pred-ep_next)
+{
+  if (pred-ep_predictor == PRED_LOOP_EXIT
+   exit_hitrate  hitrate)
+{
+  hitrate = exit_hitrate + HITRATE (4);
+  if (hitrate  REG_BR_PROB_BASE)
+hitrate = REG_BR_PROB_BASE;
+  break;
+}
+}
+}
   if (integer_zerop (val))
-percent = 100 - percent;
-  predict_edge (then_edge, PRED_BUILTIN_EXPECT, HITRATE (percent));
+hitrate = REG_BR_PROB_BASE - hitrate;
+  predict_edge (then_edge, PRED_BUILTIN_EXPECT, hitrate);
 }
   /* Try pointer heuristic.
  A comparison ptr == 0 is predicted as false.


Re: [PATCH] alternative hirate for builtin_expert

2013-10-04 Thread Rong Xu
My change on the probability of builtin_expect does have an impact on
the partial inline (more outlined functions will get inline back to
the original function).
I think this triggers an existing issue.
Dehao will explain this in his coming email.

-Rong

On Fri, Oct 4, 2013 at 6:05 AM, Ramana Radhakrishnan ramra...@arm.com wrote:
 On 10/02/13 23:49, Rong Xu wrote:

 Here is the new patch. Honaz: Could you take a look?

 Thanks,

 -Rong

 On Wed, Oct 2, 2013 at 2:31 PM, Jan Hubicka hubi...@ucw.cz wrote:

 Thanks for the suggestion. This is much cleaner than to use binary
 parameter.

 Just want to make sure I understand it correctly about the orginal
 hitrate:
 you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use
 the one specified in the biniltin-expect-probability parameter.


 Yes.


 Should I use 90% as the default? It's hard to fit current value 0.9996
 in percent form.


 Yes, 90% seems fine.  The original value was set quite arbitrarily and no
 real
 performance study was made as far as I know except yours. I think users
 that
 are sure they use expect to gueard completely cold edges may just use
 100%
 instead of 0.9996, so I would not worry much about the precision.

 Honza


 -Rong


 OK with that change.

 Honza



 This broke arm-linux-gnueabihf building libstdc++-v3. I haven't dug further
 yet but still reducing the testcase.

 See

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58619

 for details.

 Can you please deal with this appropriately ?

 regards
 Ramana




Re: [PATCH] alternative hirate for builtin_expert

2013-10-02 Thread Rong Xu
On Wed, Oct 2, 2013 at 9:08 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi,
 
 
 
Current default probability for builtin_expect is 0.9996.
  This makes the freq of unlikely bb very low (4), which
  suppresses the inlining of any calls within those bb.
 
  We used FDO data to measure the branch probably for
  the branch annotated with builtin_expert.
  For google internal benchmarks, the weight average
  (the profile count value as the weight) is 0.9081.
 
  Linux kernel is another program that is heavily annotated
  with builtin-expert. We measured its weight average as 0.8717,
  using google search as the workload.
 
  This patch sets the alternate hirate probability for builtin_expert
  to 90%. With the alternate hirate, we measured performance
  improvement for google benchmarks and Linux kernel.
 
  An earlier discussion is
  https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b
 
  This new patch is for the trunk and addresses Honza's comments.
 
  Honza: this new probability is off by default. When we backport to google
  branch we will make it the default. Let me know if you want to do the same
  here.

 I do not like much the binary parameter for 
 builtin-expect-probability-relaxed.

 I would just add bulitin-expect-probability taking value in percents and then
 make predict.c to use it. Just use predict_edge instead of predict_edge_def
 and document hitrate value as unused in predict.def.

Thanks for the suggestion. This is much cleaner than to use binary parameter.

Just want to make sure I understand it correctly about the orginal hitrate:
you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use
the one specified in the biniltin-expect-probability parameter.

Should I use 90% as the default? It's hard to fit current value 0.9996
in percent form.

-Rong

 OK with that change.

 Honza


Re: [PATCH] fix size_estimation for builtin_expect

2013-10-02 Thread Rong Xu
attached the new patch. OK for check in?

On Wed, Oct 2, 2013 at 9:12 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi,
 
  builtin_expect should be a NOP in size_estimation. Indeed, the call
  stmt itself is 0 weight in size and time. But it may introduce
  an extra relation expr which has non-zero size/time. The end result
  is: for w/ and w/o builtin_expect, we have different size/time estimation
  for inlining.
 
  This patch fixes this problem.
 
  An earlier discussion of this patch is
https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315
 
  This new patch address Honza's comments.
  It passes the bootstrap and regression.
 
  Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
  that's an overkill for this simple problem. Your code is mostly dealing
  with the recursively walk the PHI node to find the real def stmts.
  Here the traversal is within one BB and I may need to continue on multiple
  real assignment. Calling walk_use_def_chains probably only uses
  the SSA_NAME_DEF_STMT() part of the code.
 
  Thanks,
 
  -Rong

 This patch is OK.  Add white space after
 +  bool match = false;
 +  bool done = false;

 and fix
 +  if (match  single_imm_use (var, use_p, use_stmt) 
 +  (gimple_code (use_stmt) == GIMPLE_COND))

  should be at beggining of new line..

 Thanks,
 Honza


p1_patch_v2
Description: Binary data


Re: [PATCH] alternative hirate for builtin_expert

2013-10-02 Thread Rong Xu
Here is the new patch. Honaz: Could you take a look?

Thanks,

-Rong

On Wed, Oct 2, 2013 at 2:31 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Thanks for the suggestion. This is much cleaner than to use binary parameter.

 Just want to make sure I understand it correctly about the orginal hitrate:
 you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use
 the one specified in the biniltin-expect-probability parameter.

 Yes.

 Should I use 90% as the default? It's hard to fit current value 0.9996
 in percent form.

 Yes, 90% seems fine.  The original value was set quite arbitrarily and no real
 performance study was made as far as I know except yours. I think users that
 are sure they use expect to gueard completely cold edges may just use 100%
 instead of 0.9996, so I would not worry much about the precision.

 Honza

 -Rong
 
  OK with that change.
 
  Honza


p2_patch_v2
Description: Binary data


Re: [PATCH] fix size_estimation for builtin_expect

2013-10-01 Thread Rong Xu
ping.

On Fri, Sep 27, 2013 at 3:56 PM, Rong Xu x...@google.com wrote:
 Hi,

 builtin_expect should be a NOP in size_estimation. Indeed, the call
 stmt itself is 0 weight in size and time. But it may introduce
 an extra relation expr which has non-zero size/time. The end result
 is: for w/ and w/o builtin_expect, we have different size/time estimation
 for inlining.

 This patch fixes this problem.

 An earlier discussion of this patch is
   https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315

 This new patch address Honza's comments.
 It passes the bootstrap and regression.

 Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
 that's an overkill for this simple problem. Your code is mostly dealing
 with the recursively walk the PHI node to find the real def stmts.
 Here the traversal is within one BB and I may need to continue on multiple
 real assignment. Calling walk_use_def_chains probably only uses
 the SSA_NAME_DEF_STMT() part of the code.

 Thanks,

 -Rong


Re: [PATCH] alternative hirate for builtin_expert

2013-10-01 Thread Rong Xu
ping.

On Fri, Sep 27, 2013 at 3:53 PM, Rong Xu x...@google.com wrote:
 Hi,



   Current default probability for builtin_expect is 0.9996.
 This makes the freq of unlikely bb very low (4), which
 suppresses the inlining of any calls within those bb.

 We used FDO data to measure the branch probably for
 the branch annotated with builtin_expert.
 For google internal benchmarks, the weight average
 (the profile count value as the weight) is 0.9081.

 Linux kernel is another program that is heavily annotated
 with builtin-expert. We measured its weight average as 0.8717,
 using google search as the workload.

 This patch sets the alternate hirate probability for builtin_expert
 to 90%. With the alternate hirate, we measured performance
 improvement for google benchmarks and Linux kernel.

 An earlier discussion is
 https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b

 This new patch is for the trunk and addresses Honza's comments.

 Honza: this new probability is off by default. When we backport to google
 branch we will make it the default. Let me know if you want to do the same
 here.

 Thanks,

 -Rong


Re: [google gcc-4_8] fix size_estimation for builtin_expect

2013-09-27 Thread Rong Xu
On Fri, Sep 27, 2013 at 1:20 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Sep 27, 2013 at 12:23 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,

 builtin_expect should be a NOP in size_estimation. Indeed, the call
 stmt itself is 0 weight in size and time. But it may introduce
 an extra relation expr which has non-zero size/time. The end result
 is: for w/ and w/o builtin_expect, we have different size/time estimation
 for early inlining.

 This patch fixes this problem.

 -Rong

 2013-09-26  Rong Xu  x...@google.com

   * ipa-inline-analysis.c (estimate_function_body_sizes): fix
 the size estimation for builtin_expect.

 This seems fine with an comment in the code what it is about.
 I also think we want to support mutiple builtin_expects in a BB so perhaps
 we want to have pointer set of statements to ignore?

 To avoid spagetti code, please just move the new logic into separate 
 functions.

 Looks like this could use tree-ssa.c:walk_use_def_chains (please
 change its implementation as necessary, make it C++, etc. - you will
 be the first user again).


Thanks for the suggestion. But it seems walk_use_def_chains() was
removed by r201951.

-Rong

 Richard.

 Honza

 Index: ipa-inline-analysis.c
 ===
 --- ipa-inline-analysis.c (revision 202638)
 +++ ipa-inline-analysis.c (working copy)
 @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
/* Estimate static overhead for function prologue/epilogue and 
 alignment. */
int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
int size = overhead;
 +  gimple fix_expect_builtin;
 +
/* Benefits are scaled by probability of elimination that is in range
   0,2.  */
basic_block bb;
 @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
   }
   }

 +  fix_expect_builtin = NULL;
for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
   {
 gimple stmt = gsi_stmt (bsi);
 +   if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
 +{
 +  tree var = gimple_call_lhs (stmt);
 +  tree arg = gimple_call_arg (stmt, 0);
 +  use_operand_p use_p;
 +  gimple use_stmt;
 +  bool match = false;
 +  bool done = false;
 +  gcc_assert (var  arg);
 +  gcc_assert (TREE_CODE (var) == SSA_NAME);
 +
 +  while (TREE_CODE (arg) == SSA_NAME)
 +{
 +  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
 +  switch (gimple_assign_rhs_code (stmt_tmp))
 +{
 +  case LT_EXPR:
 +  case LE_EXPR:
 +  case GT_EXPR:
 +  case GE_EXPR:
 +  case EQ_EXPR:
 +  case NE_EXPR:
 +match = true;
 +done = true;
 +break;
 +  case NOP_EXPR:
 +break;
 +  default:
 +done = true;
 +break;
 +}
 +  if (done)
 +break;
 +  arg = gimple_assign_rhs1 (stmt_tmp);
 +}
 +
 +  if (match  single_imm_use (var, use_p, use_stmt))
 +{
 +  if (gimple_code (use_stmt) == GIMPLE_COND)
 +{
 +  fix_expect_builtin = use_stmt;
 +}
 +}
 +
 +  /* we should see one builtin_expert call in one bb.  */
 +  break;
 +}
 +}
 +
 +  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
 + {
 +   gimple stmt = gsi_stmt (bsi);
 int this_size = estimate_num_insns (stmt, eni_size_weights);
 int this_time = estimate_num_insns (stmt, eni_time_weights);
 int prob;
 struct predicate will_be_nonconstant;

 +   if (stmt == fix_expect_builtin)
 +{
 +  this_size--;
 +  this_time--;
 +}
 +
 if (dump_file  (dump_flags  TDF_DETAILS))
   {
 fprintf (dump_file,   );



Re: [google gcc-4_8] fix size_estimation for builtin_expect

2013-09-27 Thread Rong Xu
On Thu, Sep 26, 2013 at 3:23 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,

 builtin_expect should be a NOP in size_estimation. Indeed, the call
 stmt itself is 0 weight in size and time. But it may introduce
 an extra relation expr which has non-zero size/time. The end result
 is: for w/ and w/o builtin_expect, we have different size/time estimation
 for early inlining.

 This patch fixes this problem.

 -Rong

 2013-09-26  Rong Xu  x...@google.com

   * ipa-inline-analysis.c (estimate_function_body_sizes): fix
 the size estimation for builtin_expect.

 This seems fine with an comment in the code what it is about.
 I also think we want to support mutiple builtin_expects in a BB so perhaps
 we want to have pointer set of statements to ignore?


Thanks for feedback. I'll add the comment and split the code into a
new function.

You have a good pointer about multiple builtin_expect. I think I need
to remove the early exit (the break stmt).
But I would argue that using pointer_set is probably not necessary.

Here is the reasoning: The typical usage of builtin_expect is like:
  if (__builtin_expect (a=b, 1))  { ... }
The IR is:
  t1 = a = b;
  t2 = (long int) t1;
  t3 = __builtin_expect (t2, 1);
  if (t3 != 0) goto ...
Without the builtin, the IR is
  if (a=b) goto...

The code in the patch check the use of the builtin_expect return
value, to see if it's
used in a COND stmt. So even there are multiple builtins, only one can match.

-Rong

 To avoid spagetti code, please just move the new logic into separate 
 functions.

 Honza

 Index: ipa-inline-analysis.c
 ===
 --- ipa-inline-analysis.c (revision 202638)
 +++ ipa-inline-analysis.c (working copy)
 @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
/* Estimate static overhead for function prologue/epilogue and alignment. 
 */
int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
int size = overhead;
 +  gimple fix_expect_builtin;
 +
/* Benefits are scaled by probability of elimination that is in range
   0,2.  */
basic_block bb;
 @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
   }
   }

 +  fix_expect_builtin = NULL;
for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
   {
 gimple stmt = gsi_stmt (bsi);
 +   if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
 +{
 +  tree var = gimple_call_lhs (stmt);
 +  tree arg = gimple_call_arg (stmt, 0);
 +  use_operand_p use_p;
 +  gimple use_stmt;
 +  bool match = false;
 +  bool done = false;
 +  gcc_assert (var  arg);
 +  gcc_assert (TREE_CODE (var) == SSA_NAME);
 +
 +  while (TREE_CODE (arg) == SSA_NAME)
 +{
 +  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
 +  switch (gimple_assign_rhs_code (stmt_tmp))
 +{
 +  case LT_EXPR:
 +  case LE_EXPR:
 +  case GT_EXPR:
 +  case GE_EXPR:
 +  case EQ_EXPR:
 +  case NE_EXPR:
 +match = true;
 +done = true;
 +break;
 +  case NOP_EXPR:
 +break;
 +  default:
 +done = true;
 +break;
 +}
 +  if (done)
 +break;
 +  arg = gimple_assign_rhs1 (stmt_tmp);
 +}
 +
 +  if (match  single_imm_use (var, use_p, use_stmt))
 +{
 +  if (gimple_code (use_stmt) == GIMPLE_COND)
 +{
 +  fix_expect_builtin = use_stmt;
 +}
 +}
 +
 +  /* we should see one builtin_expert call in one bb.  */
 +  break;
 +}
 +}
 +
 +  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
 + {
 +   gimple stmt = gsi_stmt (bsi);
 int this_size = estimate_num_insns (stmt, eni_size_weights);
 int this_time = estimate_num_insns (stmt, eni_time_weights);
 int prob;
 struct predicate will_be_nonconstant;

 +   if (stmt == fix_expect_builtin)
 +{
 +  this_size--;
 +  this_time--;
 +}
 +
 if (dump_file  (dump_flags  TDF_DETAILS))
   {
 fprintf (dump_file,   );



Re: [google/4_8] Disable -g/-gmlt during LIPO instrumentation

2013-09-27 Thread Rong Xu
I don't quite understand here. We use the profile-generate memory
consumption to estimate the profile use memory consumption.
we still have -g/-gmlt in profile-use compilation. Will this change
effectively under estimate the memory use in the use phrase?

-Rong

On Fri, Sep 27, 2013 at 11:50 AM, Teresa Johnson tejohn...@google.com wrote:
 David and Rong,

 The following patch will disable -g/-gmlt when instrumenting for LIPO
 since they will affect the recorded ggc_memory used in the module
 grouping decision. Added -fripa-allow-debug to override this behavior.

 Passes regression tests and simple tests on the new functionality.

 Ok for google/4_8?

 Teresa

 2013-09-27  Teresa Johnson  tejohn...@google.com

 * opts.c (finish_options): Suppress -g/-gmlt when we are
 building for LIPO instrumention.
 * common.opt (fripa-allow-debug): New option.

 Index: opts.c
 ===
 --- opts.c  (revision 202976)
 +++ opts.c  (working copy)
 @@ -799,7 +799,7 @@ finish_options (struct gcc_options *opts, struct g
  #endif
if (!opts-x_flag_fat_lto_objects  !HAVE_LTO_PLUGIN)
  error_at (loc, -fno-fat-lto-objects are supported only with
 linker plugin.);
 -}
 +}
if ((opts-x_flag_lto_partition_balanced != 0) +
 (opts-x_flag_lto_partition_1to1 != 0)
 + (opts-x_flag_lto_partition_none != 0) = 1)
  {
 @@ -852,6 +852,26 @@ finish_options (struct gcc_options *opts, struct g
/* Turn on -ffunction-sections when -freorder-functions=* is used.  */
if (opts-x_flag_reorder_functions  1)
  opts-x_flag_function_sections = 1;
 +
 +  /* LIPO module grouping depends on the memory consumed by the profile-gen
 + parsing phase, recorded in a per-module ggc_memory field of the module
 + info struct. This will be higher when debug generation is on via
 + -g/-gmlt, which causes the FE to generate debug structures that will
 + increase the ggc_total_memory. This could in theory cause the module
 + groups to be slightly more conservative. Disable -g/-gmlt under
 + -fripa -fprofile-generate, but provide an option to override this
 + in case we actually need to debug an instrumented binary.  */
 +  if (opts-x_profile_arc_flag
 +   opts-x_flag_dyn_ipa
 +   opts-x_debug_info_level != DINFO_LEVEL_NONE
 +   !opts-x_flag_dyn_ipa_allow_debug)
 +{
 +  inform (loc,
 + Debug generation via -g option disabled under -fripa 
 +  -fprofile-generate (use -fripa-allow-debug to override));
 +  set_debug_level (NO_DEBUG, DEFAULT_GDB_EXTENSIONS, 0, opts, opts_set,
 +   loc);
 +}
  }

  #define LEFT_COLUMN27
 Index: common.opt
 ===
 --- common.opt  (revision 202976)
 +++ common.opt  (working copy)
 @@ -1155,6 +1155,10 @@ fripa
  Common Report Var(flag_dyn_ipa)
  Perform Dynamic Inter-Procedural Analysis.

 +fripa-allow-debug
 +Common Report Var(flag_dyn_ipa_allow_debug) Init(0)
 +Allow -g enablement for -fripa -fprofile-generate compiles.
 +
  fripa-disallow-asm-modules
  Common Report Var(flag_ripa_disallow_asm_modules)
  Don't import an auxiliary module if it contains asm statements

 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [google gcc-4_8] alternate hirate for builtin_expert

2013-09-27 Thread Rong Xu
On Thu, Sep 26, 2013 at 3:27 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,

 Current default probably for builtin_expect is 0.9996.
 This makes the freq of unlikely bb very low (4), which
 suppresses the inlining of any calls within those bb.

 We used FDO data to measure the branch probably for
 the branch annotated with builtin_expert.
  For google
 internal benchmarks, the weight average
 (the profile count value as the weight) is 0.9081.

 Linux kernel is another program that is heavily annotated
 with builtin-expert. We measured its weight average as 0.8717,
   using google search as
 the workload.

 This is interesting.  I was always unsure if programmers use builtin_expect
 more often to mark an impossible paths (as those leading to crash) or just
 unlikely paths.  Your data seems to suggest the second.


I don't find an official guideline on how this should be used. Maybe
some documentation
in gcc user-guide helps.

 We probably also ought to get analyze_brprob working again. It was always
 useful to get such a data.


 This patch sets the alternate hirate probability for
 builtin_expert
 to 90%. With the alternate hirate, we measured performance
   improvement for google
 benchmarks and Linux kernel.


   -Rong
 2013-09-26  Rong Xu  x...@google.com

   * params.def (DEFPARAM): New.
   * params.def: New.
   * predict.c (tree_predict_by_opcode): Alternate
 probablity hirate for builtin_expect.

 This also seems resonable for mainline.  Please add a comment
 to the parameter explaining how the value was determined.
 Please also add invoke.texi documentation.


Will do.

 For patches that seems resonable for mainline in FDO/IPA area,
 i would apprechiate if you just added me to CC, so I do not lose
 track of them.

Will do.

 Honza


[PATCH] alternative hirate for builtin_expert

2013-09-27 Thread Rong Xu
Hi,



  Current default probability for builtin_expect is 0.9996.
This makes the freq of unlikely bb very low (4), which
suppresses the inlining of any calls within those bb.

We used FDO data to measure the branch probably for
the branch annotated with builtin_expert.
For google internal benchmarks, the weight average
(the profile count value as the weight) is 0.9081.

Linux kernel is another program that is heavily annotated
with builtin-expert. We measured its weight average as 0.8717,
using google search as the workload.

This patch sets the alternate hirate probability for builtin_expert
to 90%. With the alternate hirate, we measured performance
improvement for google benchmarks and Linux kernel.

An earlier discussion is
https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b

This new patch is for the trunk and addresses Honza's comments.

Honza: this new probability is off by default. When we backport to google
branch we will make it the default. Let me know if you want to do the same
here.

Thanks,

-Rong


p2_patch
Description: Binary data


[PATCH] fix size_estimation for builtin_expect

2013-09-27 Thread Rong Xu
Hi,

builtin_expect should be a NOP in size_estimation. Indeed, the call
stmt itself is 0 weight in size and time. But it may introduce
an extra relation expr which has non-zero size/time. The end result
is: for w/ and w/o builtin_expect, we have different size/time estimation
for inlining.

This patch fixes this problem.

An earlier discussion of this patch is
  https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315

This new patch address Honza's comments.
It passes the bootstrap and regression.

Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
that's an overkill for this simple problem. Your code is mostly dealing
with the recursively walk the PHI node to find the real def stmts.
Here the traversal is within one BB and I may need to continue on multiple
real assignment. Calling walk_use_def_chains probably only uses
the SSA_NAME_DEF_STMT() part of the code.

Thanks,

-Rong


p1_patch
Description: Binary data


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-26 Thread Rong Xu
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi Honza,

 I am finally getting back to working on this after a few weeks of
 working on some other priorities.

 I am also trying to return to this, so good timming ;)
 Martin has got smaller C++ programs (Inkscape) to not touch cold segment
 during the startup with FDO (w/o partitioning). Firefox still does, I think
 the problem are lost samples due to different linker decisions even with LTO.
 (i.e. linker pick an object from .a libraryat profile-generate time that i 
 never
 passes later.).

 I plan to look into that today.

 Did you mean to commit the above change? I see that it went in as part
 of r202258 but doesn't show up in the ChangeLog entry for that
 revision.

 Yes, I meant to check it in, but did not mean to do so w/o Changelog.  I wil
 fix that.

 
  In other cases it was mostly loop unrolling in combination with jump 
  threading. So
  I modified my script to separately report when failure happens for test 
  trained
  once and test trained hundred times.

 Thanks for the linker script. I reproduced your results. I looked at a
 couple cases. The first was one that failed after 1 training run only
 (2910-2.c). It was due to jump threading, which you noted was a
 problem. For this one I think we can handle it in the partitioning,
 since there is an FDO insanity that we could probably treat more
 conservatively when splitting.

 We should fix the roundoff issues - when I was introducing the
 frequency/probability/count system I made an assumptions that parts of 
 programs
 with very low counts do not matter, since they are not part of hot spot (and I
 cared only about the hot spot).  Now we care about identifying unlikely
 executed spots and we need to fix this.

 I looked at one that failed after 100 as well (20031204-1.c). In this
 case, it was due to expansion which was creating multiple branches/bbs
 from a logical OR and guessing incorrectly on how to assign the
 counts:

  if (octets == 4  (*cp == ':' || *cp == '\0')) {

 The (*cp == ':' || *cp == '\0') part looked like the following going
 into RTL expansion:

   [20031204-1.c : 31:33] _29 = _28 == 58;
   [20031204-1.c : 31:33] _30 = _28 == 0;
   [20031204-1.c : 31:33] _31 = _29 | _30;
   [20031204-1.c : 31:18] if (_31 != 0)
 goto bb 16;
   else
 goto bb 19;

 where the result of the OR was always true, so bb 16 had a count of
 100 and bb 19 a count of 0. When it was expanded, the expanded version
 of the above turned into 2 bbs with a branch in between. Both
 comparisons were done in the first bb, but the first bb checked
 whether the result of the *cp == '\0' compare was true, and if not
 branched to the check for whether the *cp == ':' compare was true. It
 gave the branch to the second check against ':' a count of 0, so that
 bb got a count of 0 and was split out, and put the count of 100 on the
 fall through assuming the compare with '\0' always evaluated to true.
 In reality, this OR condition was always true because *cp was ':', not
 '\0'. Therefore, the count of 0 on the second block with the check for
 ':' was incorrect, we ended up trying to execute it, and failed.

 Presumably we had the correct profile data for both blocks, but the
 accuracy was reduced when the OR was represented as a logical
 computation with a single branch. We could change the expansion code
 to do something different, e.g. treat as a 50-50 branch. But we would
 still end up with integer truncation issues when there was a single
 training run. But that could be dealt with conservatively in the
 bbpart code as I suggested for the jump threading issue above. I.e. a
 cold block with incoming non-cold edges conservatively not marked cold
 for splitting.

 
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2422-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2910-2.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20020413-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20030903-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060420-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060905-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-2.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120808-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920726-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/981001-1.c
  FAIL 

[google gcc-4_8] fix size_estimation for builtin_expect

2013-09-26 Thread Rong Xu
Hi,

builtin_expect should be a NOP in size_estimation. Indeed, the call
stmt itself is 0 weight in size and time. But it may introduce
an extra relation expr which has non-zero size/time. The end result
is: for w/ and w/o builtin_expect, we have different size/time estimation
for early inlining.

This patch fixes this problem.

-Rong
2013-09-26  Rong Xu  x...@google.com

* ipa-inline-analysis.c (estimate_function_body_sizes): fix
the size estimation for builtin_expect.

Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 202638)
+++ ipa-inline-analysis.c   (working copy)
@@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
   /* Estimate static overhead for function prologue/epilogue and alignment. */
   int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
   int size = overhead;
+  gimple fix_expect_builtin;
+
   /* Benefits are scaled by probability of elimination that is in range
  0,2.  */
   basic_block bb;
@@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
}
}
 
+  fix_expect_builtin = NULL;
   for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
{
  gimple stmt = gsi_stmt (bsi);
+ if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
+{
+  tree var = gimple_call_lhs (stmt);
+  tree arg = gimple_call_arg (stmt, 0);
+  use_operand_p use_p;
+  gimple use_stmt;
+  bool match = false;
+  bool done = false;
+  gcc_assert (var  arg);
+  gcc_assert (TREE_CODE (var) == SSA_NAME);
+
+  while (TREE_CODE (arg) == SSA_NAME)
+{
+  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
+  switch (gimple_assign_rhs_code (stmt_tmp))
+{
+  case LT_EXPR:
+  case LE_EXPR:
+  case GT_EXPR:
+  case GE_EXPR:
+  case EQ_EXPR:
+  case NE_EXPR:
+match = true;
+done = true;
+break;
+  case NOP_EXPR:
+break;
+  default:
+done = true;
+break;
+}
+  if (done)
+break;
+  arg = gimple_assign_rhs1 (stmt_tmp);
+}
+
+  if (match  single_imm_use (var, use_p, use_stmt))
+{
+  if (gimple_code (use_stmt) == GIMPLE_COND)
+{
+  fix_expect_builtin = use_stmt;
+}
+}
+
+  /* we should see one builtin_expert call in one bb.  */
+  break;
+}
+}
+
+  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (bsi))
+   {
+ gimple stmt = gsi_stmt (bsi);
  int this_size = estimate_num_insns (stmt, eni_size_weights);
  int this_time = estimate_num_insns (stmt, eni_time_weights);
  int prob;
  struct predicate will_be_nonconstant;
 
+ if (stmt == fix_expect_builtin)
+{
+  this_size--;
+  this_time--;
+}
+
  if (dump_file  (dump_flags  TDF_DETAILS))
{
  fprintf (dump_file,   );


[google gcc-4_8] alternate hirate for builtin_expert

2013-09-26 Thread Rong Xu
Hi,

Current default probably for builtin_expect is 0.9996.
This makes the freq of unlikely bb very low (4), which
suppresses the inlining of any calls within those bb.

We used FDO data to measure the branch probably for
the branch annotated with builtin_expert.
 For google
internal benchmarks, the weight average
(the profile count value as the weight) is 0.9081.

Linux kernel is another program that is heavily annotated
with builtin-expert. We measured its weight average as 0.8717,
  using google search as
the workload.


This patch sets the alternate hirate probability for
builtin_expert
to 90%. With the alternate hirate, we measured performance
  improvement for google
benchmarks and Linux kernel.


  -Rong
2013-09-26  Rong Xu  x...@google.com

* params.def (DEFPARAM): New.
* params.def: New.
* predict.c (tree_predict_by_opcode): Alternate 
probablity hirate for builtin_expect.

Index: params.def
===
--- params.def  (revision 202638)
+++ params.def  (working copy)
@@ -483,6 +483,10 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY,
 tracer-min-branch-probability,
 Stop forward growth if the probability of best edge is less than this 
threshold (in percent). Used when profile feedback is not available,
 50, 0, 100)
+DEFPARAM(BUILTIN_EXPECT_PROBABILITY_RELAXED,
+builtin-expect-probability-relaxed,
+Set the estimated probability for builtin expect. By default using 
PROB_VERY_LIKELY, while value of 1 using HIRATE(90) probability.,
+0, 0, 1)
 
 /* The maximum number of incoming edges to consider for crossjumping.  */
 DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
Index: params.def
===
--- params.def  (revision 202638)
+++ params.def  (working copy)
@@ -483,6 +483,10 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY,
 tracer-min-branch-probability,
 Stop forward growth if the probability of best edge is less than this 
threshold (in percent). Used when profile feedback is not available,
 50, 0, 100)
+DEFPARAM(BUILTIN_EXPECT_PROBABILITY_RELAXED,
+builtin-expect-probability-relaxed,
+Set the estimated probability for builtin expect. By default using 
PROB_VERY_LIKELY, while value of 1 using HIRATE(90) probability.,
+0, 0, 1)
 
 /* The maximum number of incoming edges to consider for crossjumping.  */
 DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
Index: predict.c
===
--- predict.c   (revision 202638)
+++ predict.c   (working copy)
@@ -1950,10 +1950,17 @@ tree_predict_by_opcode (basic_block bb)
   BITMAP_FREE (visited);
   if (val)
 {
+  enum br_predictor predictor;
+
+  if (PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY_RELAXED) == 0)
+predictor = PRED_BUILTIN_EXPECT;
+  else
+predictor = PRED_BUILTIN_EXPECT_RELAXED;
+
   if (integer_zerop (val))
-   predict_edge_def (then_edge, PRED_BUILTIN_EXPECT, NOT_TAKEN);
+   predict_edge_def (then_edge, predictor, NOT_TAKEN);
   else
-   predict_edge_def (then_edge, PRED_BUILTIN_EXPECT, TAKEN);
+   predict_edge_def (then_edge, predictor, TAKEN);
   return;
 }
   /* Try pointer heuristic.


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-26 Thread Rong Xu
On Thu, Sep 26, 2013 at 2:54 PM, Jan Hubicka hubi...@ucw.cz wrote:
  As for COMDAT merging, i would like to see the patch.  I am experimenting
  now with a patch to also privatize COMDATs during -fprofile-generate to
  avoid problems with lost profiles mentioned above.
 

 Do you mean you privatize every COMDAT function in the profile-generate?
 We discussed this idea internally and we thought it would not work for
 large applications (like in google) due to size.

 Yes, Martin and I plan to test this on firefox.  In a way you already have all
 the COMDAT functions unshared in the object files, so the resulting binary
 should not be completely off the limits.  But I do not have any quantitative
 data, yet, since we hit bug in constant folding and devirtualization I fixed 
 in
 meantime but we did not re-run the tests yet.

LInker removes a great numbers of duplicated copies, esp for those
template functions.
We don't have a quantitative numbers either. But I'll collect some soon.


  As for context sensitivity, one approach would be to have two sets of
  counters for every comdat - one merged globally and one counting local
  instances.  We can then privatize always and at profile read in stage
  just clone every comdat and have two instances - one for offline copy
  and one for inlining.
 

 In my implementation, I also allow multiple sets of COMDAT profile
 co-existing in one compilation.
 Due to the auxiliary modules in LIPO, I actually have more than two.

 How does auxiliary modules work?

It pulls in multiple profiles from other compilation. So there might be multiple
inlined profiles.


 But I'm wondering how do you determine which profile to use for each
 call-site -- the inline decision may not
 be the same for profile-generate and profile-use compilation.

 My suggestion was to simply use the module local profile for all inline sites
 within the given module and the global profile for the offline copy of the
 function (that one will, in the case it survives linking, be shared across
 all the modules anyway).

For simple example like:
callsite1 -- comcat_function_foo
callsite2 -- comdat_function_foo

callsite1 is inlined in profile-generate, it has its own inlined
profile counter.
callsite2 is not inlined and the profile goes to the offline copies.
let's callsite 1 is cold (0 counter) and callsite 2 is hot. Using
local profile (the cold one)
for callsite2 will not be correct.


 I think this may work in the cases where i.e. use of hash templates in one
 module is very different (in average size) from other module.
 I did not really put much effort into it - I currently worry primarily about
 the cases where profile is lost completely since it gets attached to a 
 function
 not surviving final linking (or because we inline something we did not inlined
 at profile time).

 As for context sensitivity, we may try to consider developing more consistent
 solution for this.  COMDAT functions are definitely not only that may exhibit
 context sensitive behaviour.
 One approach would be to always have multiple counters for each function and
 hash based on cbacktraces collected by indirect call profiling 
 instrumentation.
 In a way this is same path profiling, but that would definitely add quite some
 overhead + we will need to think of resonable way to represent this within
 compiler.

 How do you decide what functions you want to have multiple profiles for?

I do the instrumentation after ipa-inline for comdat function. I know
if a callsite
is inlined or not. In profile-use phrase, I also need to provide to
the context (which module this is from) to pick
the right profile.


 Honza


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-30 Thread Rong Xu
On Fri, Aug 30, 2013 at 7:50 AM, Teresa Johnson tejohn...@google.com wrote:
 Can someone review and ok the attached patch for trunk? It has been
 bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by
 enabling -freorder-blocks-and-partition enabled for a
 profiledbootstrap as well.

 (Honza, see more responses inlined below. Rong, please see note below as 
 well).

 Thanks,
 Teresa

 On Fri, Aug 30, 2013 at 2:14 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Great! Is this the LTO merging you were talking about in an earlier
 message, or the gcov runtime fix (that would presumably not be
 lto-specific)?

 It is the LTO path - we need to merge profiles there anyway for his code 
 unification
 work.

 Rong - can you send a summary of the approach you are working on? Is
 it LIPO-specific?

I'm also working to improve COMDAT handling in FDO/LIPO.  It's
applicable to regular FDO.
Our motivation case is different from the case talked here:
  COMDAT function F is defined in module A and is picked by linker as
the out-of-line copy in profile-gen phrase.
It gets all the counters. In profile-use compilation F may not be
emitted in A (like all the callsites are ipa-inlined in A),
We choose instance of F in module B as the out-of-line copy, and it's
not optimized.

We are seeing more of this kind of cases in LIPO due to multiple
COMDAT copies brought by the auxiliary modules.

Since COMDAT function may be inlined after instrumentation, multiple
copies of the counters make be co-exists
We want to differentiate inlined-copy counters or out-of-line-copy counters.
(In LIPO, we actually encourage the inline of COMDAT in IPA-inline to
get more context sensitive counters.)

Our current solution is to have another instrumentation only for
COMDAT functions.
* For each  comdat_key, we create a global var pointing to the
gcov_fn_info of the out-of-line copy.
* This global var is initialized by the instrumentation code placed in
the function entry, to the value of gcov_fn_info of current module.
   This is post IPA-inline instrumentation. So at most one
instrumentation (the one picked by linker) is executed.
* Expand gcov_fn_info to points to the global var.
In the run time, we can differentiate if the counter out-of-line copy
or inlined copy. We set a tag in gcda file accordingly.
(in the case of both out-of-line copy and inlined copy, we treat it as
out-of-line copy).

In the profile-use phrase, we use this info to resolve the decls of
COMDAT functions, and also make sure we only emit
the out-of-line copy of the COMDAT function (even if it's not
referenced in the module).

This has been done and we are testing it now.

I talked to Teresa about the her case some time back. It seems that
merging the counters is an easy change with the above patch because
we have the address of the out-of-line gcov_fn_info. We can do a
simple in-memory merge if the checksum matches. The concern is
that this may reduce the context sensitive information.

-Rong



  I have patch to track this.  Moreover vforks seems to produce repeated
  merging of results.

 Aha, does this explain the gimp issue as well?

 Not really - we still need to debug why we hit cold section so many times 
 with
 partitioning.  I sitll think easier approach will be to lock the cold 
 section and
 then start probably with testsuite (i.e. write script to compile the small 
 testcases
 with FDO + partitioning and see what crash by hitting cold section).

 Ok, that is on my todo list.


 
  Is it really necessary to run this from internal loop of the cfgcleanup?

 The reason I added it here is that just below there is a call to
 verify_flow_info, and that will fail with the new verification.

 Hmm, OK, I suppose we run the cleanup after partitioning just once or twice, 
 right?
 We can track this incrementally - I am not sure if we put it from the 
 internal iteration
 loop we would get anything substantial either.
 Removing unreachable blocks twice is however ugly.

 When I was debugging the issue that led to this change I seemed to see
 1-2 iterations typically. Although I haven't measured it
 scientifically. It would be good to revisit that and see if we can
 pull both parts out of the loop, but as a separate patch.


 +/* Ensure that all hot bbs are included in a hot path through the
 +   procedure. This is done by calling this function twice, once
 +   with WALK_UP true (to look for paths from the entry to hot bbs) and
 +   once with WALK_UP false (to look for paths from hot bbs to the exit).
 +   Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
 +   to BBS_IN_HOT_PARTITION.  */
 +
 +static unsigned int
 +sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
 +vecbasic_block *bbs_in_hot_partition)
 +{
 +  /* Callers check this.  */
 +  gcc_checking_assert (cold_bb_count);
 +
 +  /* Keep examining hot bbs while we still have some left to check
 + and there are remaining cold bbs.  */
 +  vecbasic_block hot_bbs_to_check = 

[Google] fix a bug in lipo varpool node linking

2013-08-16 Thread Rong Xu
This patch fixed a bug in lipo varpool node linking.

C++ FE drops the initializer if it's not used in this TU. For current
varpool linking may
resolve the varpool node to the one with null initializer.

-Rong


Index: l-ipo.c
===
--- l-ipo.c (revision 201800)
+++ l-ipo.c (working copy)
@@ -2151,6 +2151,19 @@ resolve_varpool_node (struct varpool_node **slot,
   return;
 }

+  if (DECL_INITIAL (decl1)  !DECL_INITIAL (decl2))
+{
+  merge_addressable_attr (decl1, decl2);
+  return;
+}
+
+  if (!DECL_INITIAL (decl1)  DECL_INITIAL (decl2))
+{
+  *slot = node;
+  merge_addressable_attr (decl2, decl1);
+  return;
+}
+
   /* Either all complete or neither's type is complete. Just
  pick the primary module's decl.  */
   if (!varpool_is_auxiliary (*slot))


Re: [Google] Fix profiledbootstrap failure

2013-07-30 Thread Rong Xu
We have seen the issue before. It does fail the profile boostrap as it
reads a wrong gcda file.
I thought it had been fixed. (The fix was as David mentioned, setting
the default value of the parameter to 0).

-Rong

On Tue, Jul 30, 2013 at 12:02 PM, Xinliang David Li davi...@google.com wrote:
 I need to understand why this affects profile bootstrap -- is this due
 to file name conflict?

 The fix is wrong -- please do not remove the parameter. If it is a
 problem, a better fix is to change the default parameter value to 0.

 David


 On Tue, Jul 30, 2013 at 11:56 AM, Teresa Johnson tejohn...@google.com wrote:
 cc'ing Rong and David since this came from LIPO support.

 The patch as-is removes the one use of PARAM_GCOV_DEBUG (which is on
 by default) without removing the parameter itself. What is the failure
 mode you see from this code?

 Thanks,
 Teresa

 On Tue, Jul 30, 2013 at 11:50 AM, Dinar Temirbulatov
 di...@kugelworks.com wrote:
 Hello

 This change allows to complete profiledbootstrap on the google gcc-4.8
 branch, tested with make bootstrap with no new regressions.  OK for
 google 4.8?
thanks, Dinar.



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [Google] Fix profiledbootstrap failure

2013-07-30 Thread Rong Xu
Will do.

The patch was in gcc-4_7 by Dehao.

r194713 | dehao | 2012-12-24 16:49:06 -0800 (Mon, 24 Dec 2012) | 5 lines

Fix the profiled bootstrap:

1. Set the default value of gcov-debug to be 0.
2. Merge profile summaries from different instrumented binaries.

On Tue, Jul 30, 2013 at 12:58 PM, Xinliang David Li davi...@google.com wrote:
 Ok. Rong, can you help commit the parameter default setting patch?

 thanks,

 David

 On Tue, Jul 30, 2013 at 12:48 PM, Rong Xu x...@google.com wrote:
 We have seen the issue before. It does fail the profile boostrap as it
 reads a wrong gcda file.
 I thought it had been fixed. (The fix was as David mentioned, setting
 the default value of the parameter to 0).

 -Rong

 On Tue, Jul 30, 2013 at 12:02 PM, Xinliang David Li davi...@google.com 
 wrote:
 I need to understand why this affects profile bootstrap -- is this due
 to file name conflict?

 The fix is wrong -- please do not remove the parameter. If it is a
 problem, a better fix is to change the default parameter value to 0.

 David


 On Tue, Jul 30, 2013 at 11:56 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 cc'ing Rong and David since this came from LIPO support.

 The patch as-is removes the one use of PARAM_GCOV_DEBUG (which is on
 by default) without removing the parameter itself. What is the failure
 mode you see from this code?

 Thanks,
 Teresa

 On Tue, Jul 30, 2013 at 11:50 AM, Dinar Temirbulatov
 di...@kugelworks.com wrote:
 Hello

 This change allows to complete profiledbootstrap on the google gcc-4.8
 branch, tested with make bootstrap with no new regressions.  OK for
 google 4.8?
thanks, Dinar.



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[google gcc-4_8] Force cmd-line match for option -ansi in LIPO use

2013-07-30 Thread Rong Xu
The following patch forces the command line match for -ansi option
in LIPO use build. Otherwise, it gets various undefined symbol errors.
This is exposed in LIPO random grouping test.

Tested with google internal benchmarks and gcc regression test.

2013-07-30  Rong Xu  x...@google.com

* gcc/coverage.c (force_matching_cg_opts): require
cmd line match on -ansi option in LIPO use build.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 201219)
+++ gcc/coverage.c  (working copy)
@@ -263,6 +263,7 @@ static struct opt_desc force_matching_cg_opts[] =
 { -fsized-delete, -fno-sized-delete, false },
 { -frtti, -fno-rtti, true },
 { -fstrict-aliasing, -fno-strict-aliasing, true },
+{ -ansi, , false },
 { NULL, NULL, false }
   };


  1   2   >