On 2018-07-20 06:05 AM, Richard Biener wrote:
> On Fri, Jul 20, 2018 at 4:48 AM Michael Ploujnikov
> <michael.ploujni...@oracle.com> wrote:
>>
>> On 2018-07-17 04:25 PM, Michael Ploujnikov wrote:
>>> On 2018-07-17 06:02 AM, Richard Biener wrote:
>>>> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer
>>>> <rep.dot....@gmail.com> wrote:
>>>>>
>>>>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov 
>>>>> <michael.ploujni...@oracle.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>
>>>>>> +    clone_fn_ids = hash_map<const char *, unsigned>::create_ggc
>>>>>> (1000);
>>>>>
>>>>> Isn't 1000 a bit excessive? What about 64 or thereabouts?
>>>>
>>>> I'm not sure we should throw memory at this "problem" in this way.
>>>> What specific issue
>>>> does this address?
>>>
>>> This goes along with the general theme of preventing changes to one 
>>> function affecting codegen of others. What I'm seeing in this case is when 
>>> a function bar is modified codegen decides to create more clones of it (eg: 
>>> during the constprop pass). These extra clones cause the global counter to 
>>> increment so the clones of the unchanged function foo are named differently 
>>> only because of a source change to bar. I was hoping that the testcase 
>>> would be a good illustration, but perhaps not; is there anything else I can 
>>> do to make this clearer?
>>>
>>>
>>>>
>>>> Iff then I belive forgoing the automatic counter addidion is the way
>>>> to go and hand
>>>> control of that to the callers (for example the caller in
>>>> lto/lto-partition.c doesn't
>>>> even seem to need that counter.
>>>
>>> How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm 
>>> assuming it has a good reason to call clone_function_name_1 rather than 
>>> appending ".lto_priv" itself.
>>>
>>>> You also assume the string you key on persists - luckily the
>>>> lto-partition.c caller
>>>> leaks it but this makes your approach quite fragile in my eye (put the 
>>>> logic
>>>> into clone_function_name instead, where you at least know you are dealing
>>>> with a string from an indentifier which are never collected).
>>>>
>>>> Richard.
>>>>
>>>
>>> Is this what you had in mind?:
>>>
>>> diff --git gcc/cgraphclones.c gcc/cgraphclones.c
>>> index 6e84a31..f000420 100644
>>> --- gcc/cgraphclones.c
>>> +++ gcc/cgraphclones.c
>>> @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
>>> prof_count,
>>>    return new_node;
>>>  }
>>>
>>> -static GTY(()) unsigned int clone_fn_id_num;
>>> +static GTY(()) hash_map<const char *, unsigned> *clone_fn_ids;
>>>
>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>>     NAME.  */
>>> @@ -521,14 +521,13 @@ tree
>>>  clone_function_name_1 (const char *name, const char *suffix)
>>>  {
>>>    size_t len = strlen (name);
>>> -  char *tmp_name, *prefix;
>>> +  char *prefix;
>>>
>>>    prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>    memcpy (prefix, name, len);
>>>    strcpy (prefix + len + 1, suffix);
>>>    prefix[len] = symbol_table::symbol_suffix_separator ();
>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>> -  return get_identifier (tmp_name);
>>> +  return get_identifier (prefix);
>>>  }
>>>
>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>> @@ -537,7 +536,17 @@ tree
>>>  clone_function_name (tree decl, const char *suffix)
>>>  {
>>>    tree name = DECL_ASSEMBLER_NAME (decl);
>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>> +  char *decl_name = IDENTIFIER_POINTER (name);
>>> +  char *numbered_name;
>>> +  unsigned int *suffix_counter;
>>> +  if (!clone_fn_ids) {
>>> +    /* Initialize the per-function counter hash table if this is the first 
>>> call */
>>> +    clone_fn_ids = hash_map<const char *, unsigned>::create_ggc (64);
>>> +  }
>>> +  suffix_counter = &clone_fn_ids->get_or_insert(name);
>>> +  ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter);
>>> +  *suffix_counter = *suffix_counter + 1;
>>> +  return clone_function_name_1 (numbered_name, suffix);
>>>  }
>>>
>>> - Michael
>>>
>>>
>>>
>>
>> Ping, and below is the updated version of the full patch with changelogs:
>>
>>
>> gcc:
>> 2018-07-16  Michael Ploujnikov  <michael.ploujni...@oracle.com>
>>
>>        Make function clone name numbering independent.
>>        * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids.
>>        (clone_function_name_1): Move suffixing to clone_function_name
>>        and change it to use per-function clone_fn_ids.
>>
>> testsuite:
>> 2018-07-16  Michael Ploujnikov  <michael.ploujni...@oracle.com>
>>
>>         Clone id counters should be completely independent from one another.
>>         * gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test.
>>
>> ---
>>  gcc/cgraphclones.c                            | 19 ++++++++++----
>>  gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 
>> +++++++++++++++++++++++++++
>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c
>>
>> diff --git gcc/cgraphclones.c gcc/cgraphclones.c
>> index 6e84a31..e1a77a2 100644
>> --- gcc/cgraphclones.c
>> +++ gcc/cgraphclones.c
>> @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
>> prof_count,
>>    return new_node;
>>  }
>>
>> -static GTY(()) unsigned int clone_fn_id_num;
>> +static GTY(()) hash_map<const char *, unsigned> *clone_fn_ids;
>>
>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>     NAME.  */
>> @@ -521,14 +521,13 @@ tree
>>  clone_function_name_1 (const char *name, const char *suffix)
> 
> pass this function the counter to use....
> 
>>  {
>>    size_t len = strlen (name);
>> -  char *tmp_name, *prefix;
>> +  char *prefix;
>>
>>    prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>    memcpy (prefix, name, len);
>>    strcpy (prefix + len + 1, suffix);
>>    prefix[len] = symbol_table::symbol_suffix_separator ();
>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
> 
> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
> the lto/lto-partition.c caller (just use zero as counter).
> 
>> -  return get_identifier (tmp_name);
>> +  return get_identifier (prefix);
>>  }
>>
>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>> @@ -537,7 +536,17 @@ tree
>>  clone_function_name (tree decl, const char *suffix)
>>  {
>>    tree name = DECL_ASSEMBLER_NAME (decl);
>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>> +  char *numbered_name;
>> +  unsigned int *suffix_counter;
>> +  if (!clone_fn_ids) {
>> +    /* Initialize the per-function counter hash table if this is the first 
>> call */
>> +    clone_fn_ids = hash_map<const char *, unsigned>::create_ggc (64);
>> +  }
> 
> I still do not like throwing memory at the problem in this way for the
> little benefit
> this change provides :/
> 
> So no approval from me at this point...
> 
> Richard.

Can you give me an idea of the memory constraints that are involved?

The highest memory usage increase that I could find was when compiling
a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
increase over the before-patch use of 6936kB which is barely 0.03%.

Using a single counter can result in more confusing namespacing when
you have .bar.clone.4 despite there only being 3 clones of .bar.

From a practical point of view this change is helpful to anyone
diffing binary output such as forensic analysts, Debian Reproducible
Builds or even someone validating compiler output (before and after an input
source patch). The extra changes that this patch alleviates are a
distraction and could even be misleading. For example, applying a
source patch to the same Linux source produces the following binary
diff before my change:

--- /tmp/output.o.objdump
+++ /tmp/patched-output.o.objdump
@@ -1,5 +1,5 @@

-/tmp/uverbs_cmd/output.o:     file format elf32-i386
+/tmp/uverbs_cmd/patched-output.o:     file format elf32-i386


 Disassembly of section .text.get_order:
@@ -265,12 +265,12 @@
    3:  e9 fc ff ff ff          jmp    4 <put_cq_read+0x4>
                        4: R_386_PC32   .text.put_uobj_read

-Disassembly of section .text.trace_kmalloc.constprop.3:
+Disassembly of section .text.trace_kmalloc.constprop.4:

-00000000 <trace_kmalloc.constprop.3>:
+00000000 <trace_kmalloc.constprop.4>:
    0:  83 3d 04 00 00 00 00    cmpl   $0x0,0x4
                        2: R_386_32     __tracepoint_kmalloc
-   7:  74 34                   je     3d <trace_kmalloc.constprop.3+0x3d>
+   7:  74 34                   je     3d <trace_kmalloc.constprop.4+0x3d>
    9:  55                      push   %ebp
    a:  89 cd                   mov    %ecx,%ebp
    c:  57                      push   %edi
@@ -281,7 +281,7 @@
   13:  8b 1d 10 00 00 00       mov    0x10,%ebx
                        15: R_386_32    __tracepoint_kmalloc
   19:  85 db                   test   %ebx,%ebx
-  1b:  74 1b                   je     38 <trace_kmalloc.constprop.3+0x38>
+  1b:  74 1b                   je     38 <trace_kmalloc.constprop.4+0x38>
   1d:  68 d0 00 00 00          push   $0xd0
   22:  89 fa                   mov    %edi,%edx
   24:  89 f0                   mov    %esi,%eax
@@ -292,7 +292,7 @@
   31:  58                      pop    %eax
   32:  83 3b 00                cmpl   $0x0,(%ebx)
   35:  5a                      pop    %edx
-  36:  eb e3                   jmp    1b <trace_kmalloc.constprop.3+0x1b>
+  36:  eb e3                   jmp    1b <trace_kmalloc.constprop.4+0x1b>
   38:  5b                      pop    %ebx
   39:  5e                      pop    %esi
   3a:  5f                      pop    %edi
@@ -846,7 +846,7 @@
   78:  b8 5f 00 00 00          mov    $0x5f,%eax
                        79: R_386_32    .text.ib_uverbs_alloc_pd
   7d:  e8 fc ff ff ff          call   7e <ib_uverbs_alloc_pd+0x7e>
-                       7e: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       7e: R_386_PC32  .text.trace_kmalloc.constprop.4
   82:  c7 45 d4 f4 ff ff ff    movl   $0xfffffff4,-0x2c(%ebp)
   89:  59                      pop    %ecx
   8a:  85 db                   test   %ebx,%ebx
@@ -1068,7 +1068,7 @@
   9c:  b8 83 00 00 00          mov    $0x83,%eax
                        9d: R_386_32    .text.ib_uverbs_reg_mr
   a1:  e8 fc ff ff ff          call   a2 <ib_uverbs_reg_mr+0xa2>
-                       a2: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       a2: R_386_PC32  .text.trace_kmalloc.constprop.4
   a6:  ba f4 ff ff ff          mov    $0xfffffff4,%edx
   ab:  58                      pop    %eax
   ac:  85 db                   test   %ebx,%ebx
@@ -1385,7 +1385,7 @@
   99:  b8 7b 00 00 00          mov    $0x7b,%eax
                        9a: R_386_32    .text.ib_uverbs_create_cq
   9e:  e8 fc ff ff ff          call   9f <ib_uverbs_create_cq+0x9f>
-                       9f: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       9f: R_386_PC32  .text.trace_kmalloc.constprop.4
   a3:  58                      pop    %eax
   a4:  85 db                   test   %ebx,%ebx
   a6:  75 0a                   jne    b2 <ib_uverbs_create_cq+0xb2>
@@ -1607,129 +1607,107 @@
 00000000 <ib_uverbs_poll_cq>:
    0:  55                      push   %ebp
    1:  57                      push   %edi
-   2:  89 c7                   mov    %eax,%edi
-   4:  56                      push   %esi
-   5:  89 ce                   mov    %ecx,%esi
-   7:  b9 10 00 00 00          mov    $0x10,%ecx
-   c:  53                      push   %ebx
-   d:  83 ec 18                sub    $0x18,%esp
-  10:  8d 44 24 08             lea    0x8(%esp),%eax
-  14:  e8 fc ff ff ff          call   15 <ib_uverbs_poll_cq+0x15>
-                       15: R_386_PC32  copy_from_user
-  19:  85 c0                   test   %eax,%eax
-  1b:  0f 85 34 01 00 00       jne    155 <ib_uverbs_poll_cq+0x155>
-  21:  6b 44 24 14 34          imul   $0x34,0x14(%esp),%eax
-  26:  ba d0 00 00 00          mov    $0xd0,%edx
-  2b:  e8 fc ff ff ff          call   2c <ib_uverbs_poll_cq+0x2c>
-                       2c: R_386_PC32  __kmalloc
-  30:  89 c5                   mov    %eax,%ebp
-  32:  85 c0                   test   %eax,%eax
-  34:  0f 84 22 01 00 00       je     15c <ib_uverbs_poll_cq+0x15c>
-  3a:  6b 44 24 14 30          imul   $0x30,0x14(%esp),%eax
-  3f:  ba d0 00 00 00          mov    $0xd0,%edx
-  44:  83 c0 08                add    $0x8,%eax
-  47:  89 44 24 04             mov    %eax,0x4(%esp)
-  4b:  e8 fc ff ff ff          call   4c <ib_uverbs_poll_cq+0x4c>
-                       4c: R_386_PC32  __kmalloc
-  50:  ba f4 ff ff ff          mov    $0xfffffff4,%edx
-  55:  89 04 24                mov    %eax,(%esp)
-  58:  85 c0                   test   %eax,%eax
-  5a:  0f 84 e1 00 00 00       je     141 <ib_uverbs_poll_cq+0x141>
-  60:  8b 4f 58                mov    0x58(%edi),%ecx
-  63:  6a 00                   push   $0x0
-  65:  b8 00 00 00 00          mov    $0x0,%eax
-                       66: R_386_32    ib_uverbs_cq_idr
-  6a:  8b 54 24 14             mov    0x14(%esp),%edx
-  6e:  e8 fc ff ff ff          call   6f <ib_uverbs_poll_cq+0x6f>
-                       6f: R_386_PC32  .text.idr_read_obj
-  73:  ba ea ff ff ff          mov    $0xffffffea,%edx
-  78:  89 c7                   mov    %eax,%edi
-  7a:  58                      pop    %eax
-  7b:  85 ff                   test   %edi,%edi
-  7d:  0f 84 ae 00 00 00       je     131 <ib_uverbs_poll_cq+0x131>
-  83:  8b 1f                   mov    (%edi),%ebx
-  85:  8b 54 24 14             mov    0x14(%esp),%edx
-  89:  89 e9                   mov    %ebp,%ecx
-  8b:  89 f8                   mov    %edi,%eax
-  8d:  ff 93 70 01 00 00       call   *0x170(%ebx)
-  93:  8b 1c 24                mov    (%esp),%ebx
-  96:  89 03                   mov    %eax,(%ebx)
-  98:  89 f8                   mov    %edi,%eax
-  9a:  e8 fc ff ff ff          call   9b <ib_uverbs_poll_cq+0x9b>
-                       9b: R_386_PC32  .text.put_cq_read
-  9f:  8b 1c 24                mov    (%esp),%ebx
-  a2:  89 e8                   mov    %ebp,%eax
-  a4:  6b 3b 34                imul   $0x34,(%ebx),%edi
-  a7:  8d 53 08                lea    0x8(%ebx),%edx
-  aa:  01 ef                   add    %ebp,%edi
-  ac:  39 f8                   cmp    %edi,%eax
-  ae:  74 67                   je     117 <ib_uverbs_poll_cq+0x117>
-  b0:  8b 08                   mov    (%eax),%ecx
-  b2:  8b 58 04                mov    0x4(%eax),%ebx
-  b5:  83 c2 30                add    $0x30,%edx
-  b8:  83 c0 34                add    $0x34,%eax
-  bb:  89 4a d0                mov    %ecx,-0x30(%edx)
-  be:  89 5a d4                mov    %ebx,-0x2c(%edx)
-  c1:  8b 48 d4                mov    -0x2c(%eax),%ecx
-  c4:  89 4a d8                mov    %ecx,-0x28(%edx)
-  c7:  8b 48 d8                mov    -0x28(%eax),%ecx
-  ca:  89 4a dc                mov    %ecx,-0x24(%edx)
-  cd:  8b 48 dc                mov    -0x24(%eax),%ecx
-  d0:  89 4a e0                mov    %ecx,-0x20(%edx)
-  d3:  8b 48 e0                mov    -0x20(%eax),%ecx
-  d6:  89 4a e4                mov    %ecx,-0x1c(%edx)
-  d9:  8b 48 e8                mov    -0x18(%eax),%ecx
-  dc:  89 4a e8                mov    %ecx,-0x18(%edx)
-  df:  8b 48 e4                mov    -0x1c(%eax),%ecx
-  e2:  8b 49 20                mov    0x20(%ecx),%ecx
-  e5:  89 4a ec                mov    %ecx,-0x14(%edx)
-  e8:  8b 48 ec                mov    -0x14(%eax),%ecx
-  eb:  89 4a f0                mov    %ecx,-0x10(%edx)
-  ee:  8b 48 f0                mov    -0x10(%eax),%ecx
-  f1:  89 4a f4                mov    %ecx,-0xc(%edx)
-  f4:  8b 48 f4                mov    -0xc(%eax),%ecx
-  f7:  66 89 4a f8             mov    %cx,-0x8(%edx)
-  fb:  66 8b 48 f6             mov    -0xa(%eax),%cx
-  ff:  66 89 4a fa             mov    %cx,-0x6(%edx)
- 103:  8a 48 f8                mov    -0x8(%eax),%cl
- 106:  88 4a fc                mov    %cl,-0x4(%edx)
- 109:  8a 48 f9                mov    -0x7(%eax),%cl
- 10c:  88 4a fd                mov    %cl,-0x3(%edx)
- 10f:  8a 48 fa                mov    -0x6(%eax),%cl
- 112:  88 4a fe                mov    %cl,-0x2(%edx)
- 115:  eb 95                   jmp    ac <ib_uverbs_poll_cq+0xac>
- 117:  8b 14 24                mov    (%esp),%edx
- 11a:  8b 4c 24 04             mov    0x4(%esp),%ecx
- 11e:  8b 44 24 08             mov    0x8(%esp),%eax
- 122:  e8 fc ff ff ff          call   123 <ib_uverbs_poll_cq+0x123>
-                       123: R_386_PC32 copy_to_user
- 127:  83 f8 01                cmp    $0x1,%eax
- 12a:  19 d2                   sbb    %edx,%edx
- 12c:  f7 d2                   not    %edx
- 12e:  83 e2 f2                and    $0xfffffff2,%edx
- 131:  8b 04 24                mov    (%esp),%eax
- 134:  89 54 24 04             mov    %edx,0x4(%esp)
- 138:  e8 fc ff ff ff          call   139 <ib_uverbs_poll_cq+0x139>
-                       139: R_386_PC32 kfree
- 13d:  8b 54 24 04             mov    0x4(%esp),%edx
- 141:  89 e8                   mov    %ebp,%eax
- 143:  89 14 24                mov    %edx,(%esp)
- 146:  e8 fc ff ff ff          call   147 <ib_uverbs_poll_cq+0x147>
-                       147: R_386_PC32 kfree
- 14b:  8b 14 24                mov    (%esp),%edx
- 14e:  85 d2                   test   %edx,%edx
- 150:  0f 45 f2                cmovne %edx,%esi
- 153:  eb 0c                   jmp    161 <ib_uverbs_poll_cq+0x161>
- 155:  be f2 ff ff ff          mov    $0xfffffff2,%esi
- 15a:  eb 05                   jmp    161 <ib_uverbs_poll_cq+0x161>
- 15c:  be f4 ff ff ff          mov    $0xfffffff4,%esi
- 161:  83 c4 18                add    $0x18,%esp
- 164:  89 f0                   mov    %esi,%eax
- 166:  5b                      pop    %ebx
- 167:  5e                      pop    %esi
- 168:  5f                      pop    %edi
- 169:  5d                      pop    %ebp
- 16a:  c3                      ret
+   2:  bf f2 ff ff ff          mov    $0xfffffff2,%edi
+   7:  56                      push   %esi
+   8:  53                      push   %ebx
+   9:  89 c3                   mov    %eax,%ebx
+   b:  81 ec 84 00 00 00       sub    $0x84,%esp
+  11:  89 4c 24 04             mov    %ecx,0x4(%esp)
+  15:  8d 44 24 10             lea    0x10(%esp),%eax
+  19:  b9 10 00 00 00          mov    $0x10,%ecx
+  1e:  e8 fc ff ff ff          call   1f <ib_uverbs_poll_cq+0x1f>
+                       1f: R_386_PC32  copy_from_user
+  23:  89 04 24                mov    %eax,(%esp)
+  26:  85 c0                   test   %eax,%eax
+  28:  0f 85 1f 01 00 00       jne    14d <ib_uverbs_poll_cq+0x14d>
+  2e:  8b 4b 58                mov    0x58(%ebx),%ecx
+  31:  6a 00                   push   $0x0
+  33:  b8 00 00 00 00          mov    $0x0,%eax
+                       34: R_386_32    ib_uverbs_cq_idr
+  38:  bf ea ff ff ff          mov    $0xffffffea,%edi
+  3d:  8b 54 24 1c             mov    0x1c(%esp),%edx
+  41:  e8 fc ff ff ff          call   42 <ib_uverbs_poll_cq+0x42>
+                       42: R_386_PC32  .text.idr_read_obj
+  46:  89 c3                   mov    %eax,%ebx
+  48:  58                      pop    %eax
+  49:  85 db                   test   %ebx,%ebx
+  4b:  0f 84 fc 00 00 00       je     14d <ib_uverbs_poll_cq+0x14d>
+  51:  8b 6c 24 10             mov    0x10(%esp),%ebp
+  55:  b9 02 00 00 00          mov    $0x2,%ecx
+  5a:  8d 7c 24 08             lea    0x8(%esp),%edi
+  5e:  8b 04 24                mov    (%esp),%eax
+  61:  8d 75 08                lea    0x8(%ebp),%esi
+  64:  f3 ab                   rep stos %eax,%es:(%edi)
+  66:  8b 44 24 1c             mov    0x1c(%esp),%eax
+  6a:  39 44 24 08             cmp    %eax,0x8(%esp)
+  6e:  73 1f                   jae    8f <ib_uverbs_poll_cq+0x8f>
+  70:  8b 3b                   mov    (%ebx),%edi
+  72:  8d 4c 24 50             lea    0x50(%esp),%ecx
+  76:  ba 01 00 00 00          mov    $0x1,%edx
+  7b:  89 d8                   mov    %ebx,%eax
+  7d:  ff 97 70 01 00 00       call   *0x170(%edi)
+  83:  89 c7                   mov    %eax,%edi
+  85:  85 c0                   test   %eax,%eax
+  87:  0f 88 b9 00 00 00       js     146 <ib_uverbs_poll_cq+0x146>
+  8d:  75 21                   jne    b0 <ib_uverbs_poll_cq+0xb0>
+  8f:  b9 08 00 00 00          mov    $0x8,%ecx
+  94:  8d 54 24 08             lea    0x8(%esp),%edx
+  98:  89 e8                   mov    %ebp,%eax
+  9a:  bf f2 ff ff ff          mov    $0xfffffff2,%edi
+  9f:  e8 fc ff ff ff          call   a0 <ib_uverbs_poll_cq+0xa0>
+                       a0: R_386_PC32  copy_to_user
+  a4:  85 c0                   test   %eax,%eax
+  a6:  0f 44 7c 24 04          cmove  0x4(%esp),%edi
+  ab:  e9 96 00 00 00          jmp    146 <ib_uverbs_poll_cq+0x146>
+  b0:  8b 44 24 50             mov    0x50(%esp),%eax
+  b4:  8b 54 24 54             mov    0x54(%esp),%edx
+  b8:  b9 30 00 00 00          mov    $0x30,%ecx
+  bd:  89 44 24 20             mov    %eax,0x20(%esp)
+  c1:  8b 44 24 58             mov    0x58(%esp),%eax
+  c5:  89 54 24 24             mov    %edx,0x24(%esp)
+  c9:  8b 54 24 74             mov    0x74(%esp),%edx
+  cd:  89 44 24 28             mov    %eax,0x28(%esp)
+  d1:  8b 44 24 5c             mov    0x5c(%esp),%eax
+  d5:  89 44 24 2c             mov    %eax,0x2c(%esp)
+  d9:  8b 44 24 60             mov    0x60(%esp),%eax
+  dd:  89 44 24 30             mov    %eax,0x30(%esp)
+  e1:  8b 44 24 64             mov    0x64(%esp),%eax
+  e5:  89 44 24 34             mov    %eax,0x34(%esp)
+  e9:  8b 44 24 6c             mov    0x6c(%esp),%eax
+  ed:  89 44 24 38             mov    %eax,0x38(%esp)
+  f1:  8b 44 24 68             mov    0x68(%esp),%eax
+  f5:  8b 40 20                mov    0x20(%eax),%eax
+  f8:  89 54 24 44             mov    %edx,0x44(%esp)
+  fc:  8d 54 24 20             lea    0x20(%esp),%edx
+ 100:  c6 44 24 4f 00          movb   $0x0,0x4f(%esp)
+ 105:  89 44 24 3c             mov    %eax,0x3c(%esp)
+ 109:  8b 44 24 70             mov    0x70(%esp),%eax
+ 10d:  89 44 24 40             mov    %eax,0x40(%esp)
+ 111:  8b 44 24 78             mov    0x78(%esp),%eax
+ 115:  89 44 24 48             mov    %eax,0x48(%esp)
+ 119:  8b 44 24 7c             mov    0x7c(%esp),%eax
+ 11d:  66 89 44 24 4c          mov    %ax,0x4c(%esp)
+ 122:  8a 44 24 7e             mov    0x7e(%esp),%al
+ 126:  88 44 24 4e             mov    %al,0x4e(%esp)
+ 12a:  89 f0                   mov    %esi,%eax
+ 12c:  e8 fc ff ff ff          call   12d <ib_uverbs_poll_cq+0x12d>
+                       12d: R_386_PC32 copy_to_user
+ 131:  85 c0                   test   %eax,%eax
+ 133:  75 0c                   jne    141 <ib_uverbs_poll_cq+0x141>
+ 135:  83 c6 30                add    $0x30,%esi
+ 138:  ff 44 24 08             incl   0x8(%esp)
+ 13c:  e9 25 ff ff ff          jmp    66 <ib_uverbs_poll_cq+0x66>
+ 141:  bf f2 ff ff ff          mov    $0xfffffff2,%edi
+ 146:  89 d8                   mov    %ebx,%eax
+ 148:  e8 fc ff ff ff          call   149 <ib_uverbs_poll_cq+0x149>
+                       149: R_386_PC32 .text.put_cq_read
+ 14d:  81 c4 84 00 00 00       add    $0x84,%esp
+ 153:  89 f8                   mov    %edi,%eax
+ 155:  5b                      pop    %ebx
+ 156:  5e                      pop    %esi
+ 157:  5f                      pop    %edi
+ 158:  5d                      pop    %ebp
+ 159:  c3                      ret

 Disassembly of section .text.ib_uverbs_req_notify_cq:

@@ -1915,7 +1893,7 @@
   94:  b8 7b 00 00 00          mov    $0x7b,%eax
                        95: R_386_32    .text.ib_uverbs_create_qp
   99:  e8 fc ff ff ff          call   9a <ib_uverbs_create_qp+0x9a>
-                       9a: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       9a: R_386_PC32  .text.trace_kmalloc.constprop.4
   9e:  59                      pop    %ecx
   9f:  c7 85 50 ff ff ff f4    movl   $0xfffffff4,-0xb0(%ebp)
   a6:  ff ff ff
@@ -2241,7 +2219,7 @@
   68:  b8 4f 00 00 00          mov    $0x4f,%eax
                        69: R_386_32    .text.ib_uverbs_query_qp
   6d:  e8 fc ff ff ff          call   6e <ib_uverbs_query_qp+0x6e>
-                       6e: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       6e: R_386_PC32  .text.trace_kmalloc.constprop.4
   72:  59                      pop    %ecx
   73:  c7 85 5c ff ff ff 10    movl   $0x10,-0xa4(%ebp)
   7a:  00 00 00
@@ -2260,7 +2238,7 @@
   a3:  b8 86 00 00 00          mov    $0x86,%eax
                        a4: R_386_32    .text.ib_uverbs_query_qp
   a8:  e8 fc ff ff ff          call   a9 <ib_uverbs_query_qp+0xa9>
-                       a9: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       a9: R_386_PC32  .text.trace_kmalloc.constprop.4
   ad:  5a                      pop    %edx
   ae:  85 db                   test   %ebx,%ebx
   b0:  0f 84 c1 01 00 00       je     277 <ib_uverbs_query_qp+0x277>
@@ -2462,7 +2440,7 @@
   88:  b8 6f 00 00 00          mov    $0x6f,%eax
                        89: R_386_32    .text.ib_uverbs_modify_qp
   8d:  e8 fc ff ff ff          call   8e <ib_uverbs_modify_qp+0x8e>
-                       8e: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       8e: R_386_PC32  .text.trace_kmalloc.constprop.4
   92:  5a                      pop    %edx
   93:  ba f4 ff ff ff          mov    $0xfffffff4,%edx
   98:  85 db                   test   %ebx,%ebx
@@ -3129,7 +3107,7 @@
   6a:  b8 4c 00 00 00          mov    $0x4c,%eax
                        6b: R_386_32    .text.ib_uverbs_create_ah
   6f:  e8 fc ff ff ff          call   70 <ib_uverbs_create_ah+0x70>
-                       70: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       70: R_386_PC32  .text.trace_kmalloc.constprop.4
   74:  58                      pop    %eax
   75:  85 db                   test   %ebx,%ebx
   77:  75 0a                   jne    83 <ib_uverbs_create_ah+0x83>
@@ -3396,7 +3374,7 @@
   af:  b8 91 00 00 00          mov    $0x91,%eax
                        b0: R_386_32    .text.ib_uverbs_attach_mcast
   b4:  e8 fc ff ff ff          call   b5 <ib_uverbs_attach_mcast+0xb5>
-                       b5: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       b5: R_386_PC32  .text.trace_kmalloc.constprop.4
   b9:  58                      pop    %eax
   ba:  85 db                   test   %ebx,%ebx
   bc:  75 07                   jne    c5 <ib_uverbs_attach_mcast+0xc5>
@@ -3572,7 +3550,7 @@
   77:  b8 5e 00 00 00          mov    $0x5e,%eax
                        78: R_386_32    .text.ib_uverbs_create_srq
   7c:  e8 fc ff ff ff          call   7d <ib_uverbs_create_srq+0x7d>
-                       7d: R_386_PC32  .text.trace_kmalloc.constprop.3
+                       7d: R_386_PC32  .text.trace_kmalloc.constprop.4
   81:  ba f4 ff ff ff          mov    $0xfffffff4,%edx
   86:  58                      pop    %eax
   87:  85 db                   test   %ebx,%ebx

Needless to say, after my change the diff only shows the truly changed
functions.

- Michael

> 
>> +  suffix_counter = &clone_fn_ids->get_or_insert(decl_name);
>> +  ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter);
>> +  *suffix_counter = *suffix_counter + 1;
>> +  return clone_function_name_1 (numbered_name, suffix);
>>  }
>>
>>
>> diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c 
>> gcc/testsuite/gcc.dg/independent-cloneids-1.c
>> new file mode 100644
>> index 0000000..d723e20
>> --- /dev/null
>> +++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
>> @@ -0,0 +1,38 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp"  } */
>> +
>> +extern int printf (const char *, ...);
>> +
>> +static int __attribute__ ((noinline))
>> +foo (int arg)
>> +{
>> +  return 7 * arg;
>> +}
>> +
>> +static int __attribute__ ((noinline))
>> +bar (int arg)
>> +{
>> +  return arg * arg;
>> +}
>> +
>> +int
>> +baz (int arg)
>> +{
>> +  printf("%d\n", bar (3));
>> +  printf("%d\n", bar (4));
>> +  printf("%d\n", foo (5));
>> +  printf("%d\n", foo (6));
>> +  /* adding or removing the following call should not affect foo
>> +     function's clone numbering */
>> +  printf("%d\n", bar (7));
>> +  return foo (8);
>> +}
>> +
>> +/* { dg-final { scan-ipa-dump "Function bar.constprop.0" "cp" } } */
>> +/* { dg-final { scan-ipa-dump "Function bar.constprop.1" "cp" } } */
>> +/* { dg-final { scan-ipa-dump "Function bar.constprop.3" "cp" } } */
>> +/* { dg-final { scan-ipa-dump "Function foo.constprop.0" "cp" } } */
>> +/* { dg-final { scan-ipa-dump "Function foo.constprop.1" "cp" } } */
>> +/* { dg-final { scan-ipa-dump "Function foo.constprop.2" "cp" } } */
>> +/* { dg-final { scan-ipa-dump-not "Function foo.constprop.3" "cp" } } */
>> +/* { dg-final { scan-ipa-dump-not "Function foo.constprop.4" "cp" } } */
>> --
>> 2.7.4
>>


Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to