Re: Question about optimizing function pointers for direct function calls
Hi Richard, Thanks for your reply. I'm looking forward to hearing from Honza! But I still have a small question about the IPA passes. If I want to make such a conversion (sum += foo[i].add == add ? add (1,2) : foo[i].add (1,2);), I definitely need to change the function's body. But the article "Using summary information in IPA passes"(https://gcc.gnu.org/onlinedocs/gccint/link-time-optimization/using-summary- Information-in-ipa-passes.html) mentioned that if IPA-Pass needs to change the function body, it needs to be done in the transform stage, but the execution stage of IPA-inline has ended before the transform stage. So we still can’t mark the newly created function calls/cgraph_edge as inlineable. Is that correct? Or did I understand something wrong? Thanks Hanke Zhang Richard Biener 于2024年6月12日周三 18:49写道: > > On Wed, Jun 12, 2024 at 11:57 AM Hanke Zhang wrote: > > > > Richard Biener 于2024年5月24日周五 14:39写道: > > > > > > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc > > > wrote: > > > > > > > > Hi, > > > > I got a question about optimizing function pointers for direct > > > > function calls in C. > > > > > > > > Consider the following scenario: one of the fields of a structure is a > > > > function pointer, and all its assignments come from the same function. > > > > Can all its uses be replaced by direct calls to this function? So the > > > > later passes can do more optimizations. > > > > > > > > Here is the example: > > > > > > > > int add(int a, int b) { return a + b; } > > > > int sub(int a, int b) { return a - b; } > > > > > > > > struct Foo { > > > > int (*add)(int, int); > > > > }; > > > > int main() > > > > { > > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > > > > > for (int i = 0; i < 5; i++) { > > > > foo[i].add = add; > > > > } > > > > > > > > int sum = 0; > > > > for (int i = 0; i < 5; i++) { > > > > sum += foo[i].add(1, 2); > > > > } > > > > > > > > return 0; > > > > } > > > > > > > > Can I replace the code above to the code below? > > > > > > > > int add(int a, int b) { return a + b; } > > > > int sub(int a, int b) { return a - b; } > > > > > > > > struct Foo { > > > > int (*add)(int, int); > > > > }; > > > > int main() > > > > { > > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > > > > > for (int i = 0; i < 5; i++) { > > > > foo[i].add = add; > > > > } > > > > > > > > int sum = 0; > > > > for (int i = 0; i < 5; i++) { > > > > sum += add(1,2); > > > > } > > > > > > > > return 0; > > > > } > > > > > > > > My idea is to determine whether the assignment of the field is the > > > > same function, and if so, perform the replacement. > > > > > > If it's as simple as above then sure, even CSE should do it. If you > > > can prove otherwise the memory location with the function pointer > > > always has the same value you are obviously fine. If you just > > > do not see any other store via 'add's FIELD_DECL then no, that > > > isn't good enough. Every void * store you do not know where it > > > goes might go to that slot. > > > > > > > Of course this is not a reasonable optimization, I just want to know > > > > if there are security issues in doing so, and if I want to do it in > > > > the IPA stage, is it possible? > > > > > > For the more general case you can do what we do for speculative > > > devirtualization - replace the code with > > > > > > sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2); > > > > Hi Richard, > > > > I'm trying to do what you suggested. (sum += foo[i].add == add ? add > > (1,2) : foo[i].add(1,2);) > > > > I created a new IPA-Pass before IPA-INLINE and I made the changes on > > the GIMPLE in the "function_transform" stage. But my newly created > > "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized > > out in the subsequent FRE. I would like to ask if there is any way to > > mark my newly created
Re: Question about optimizing function pointers for direct function calls
On Wed, Jun 12, 2024 at 11:57 AM Hanke Zhang wrote: > > Richard Biener 于2024年5月24日周五 14:39写道: > > > > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc wrote: > > > > > > Hi, > > > I got a question about optimizing function pointers for direct > > > function calls in C. > > > > > > Consider the following scenario: one of the fields of a structure is a > > > function pointer, and all its assignments come from the same function. > > > Can all its uses be replaced by direct calls to this function? So the > > > later passes can do more optimizations. > > > > > > Here is the example: > > > > > > int add(int a, int b) { return a + b; } > > > int sub(int a, int b) { return a - b; } > > > > > > struct Foo { > > > int (*add)(int, int); > > > }; > > > int main() > > > { > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > > > for (int i = 0; i < 5; i++) { > > > foo[i].add = add; > > > } > > > > > > int sum = 0; > > > for (int i = 0; i < 5; i++) { > > > sum += foo[i].add(1, 2); > > > } > > > > > > return 0; > > > } > > > > > > Can I replace the code above to the code below? > > > > > > int add(int a, int b) { return a + b; } > > > int sub(int a, int b) { return a - b; } > > > > > > struct Foo { > > > int (*add)(int, int); > > > }; > > > int main() > > > { > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > > > for (int i = 0; i < 5; i++) { > > > foo[i].add = add; > > > } > > > > > > int sum = 0; > > > for (int i = 0; i < 5; i++) { > > > sum += add(1,2); > > > } > > > > > > return 0; > > > } > > > > > > My idea is to determine whether the assignment of the field is the > > > same function, and if so, perform the replacement. > > > > If it's as simple as above then sure, even CSE should do it. If you > > can prove otherwise the memory location with the function pointer > > always has the same value you are obviously fine. If you just > > do not see any other store via 'add's FIELD_DECL then no, that > > isn't good enough. Every void * store you do not know where it > > goes might go to that slot. > > > > > Of course this is not a reasonable optimization, I just want to know > > > if there are security issues in doing so, and if I want to do it in > > > the IPA stage, is it possible? > > > > For the more general case you can do what we do for speculative > > devirtualization - replace the code with > > > > sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2); > > Hi Richard, > > I'm trying to do what you suggested. (sum += foo[i].add == add ? add > (1,2) : foo[i].add(1,2);) > > I created a new IPA-Pass before IPA-INLINE and I made the changes on > the GIMPLE in the "function_transform" stage. But my newly created > "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized > out in the subsequent FRE. I would like to ask if there is any way to > mark my newly created cgraph_edge as "inline" in the > "function_transform" stage. > > Here is part of the code creating the call_stmt in true basic_block in > my IPA-PASS: > > // ... > // part of the code have been omitted > auto_vec params; > for (int i = 0; i < gimple_call_num_args (call_stmt); i++) { > params.safe_push (gimple_call_arg (call_stmt, i)); > } > gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn() > tree tbb_ssa; > if (ret_val_flag) { > tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL); > gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn() > } > gsi = gsi_start_bb (tbb); > gsi_insert_before (, tbb_call, GSI_SAME_STMT); > cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create > (fndecl), (gcall*)tbb_call, tbb_call->bb->count, true); > // what should I do to this cgraph_edge to mark it to be inlined > // ... > > Or should I not do these things in the function_transform stage? That's indeed too late. You have to create speculative call edges, I would suggest to look how IPA devirtualization handles this case and possibly simply amend it. I'm CCing Honza who should know more about this, I do not know the details. Richard. > Thanks > Hanke Zhang > > > > > > that way we can inline the direct call and hopefully the branch will be > > well predicted. > > > > In some SPEC there is IIRC the situation where such speculative > > devirtualization candidates can be found solely based on function > > signature. With LTO/IPA you'd basically collect candidate targets > > for each indirect call (possibly address-taken function definitions > > with correct signature) and if there's just a single one you can > > choose that as speculative devirt target. > > > > Speculative devirt as we have now of course works with profile > > data to identify the most probable candidate. > > > > Richard. > > > > > > > > Thanks > > > Hanke Zhang
Re: Question about optimizing function pointers for direct function calls
Richard Biener 于2024年5月24日周五 14:39写道: > > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc wrote: > > > > Hi, > > I got a question about optimizing function pointers for direct > > function calls in C. > > > > Consider the following scenario: one of the fields of a structure is a > > function pointer, and all its assignments come from the same function. > > Can all its uses be replaced by direct calls to this function? So the > > later passes can do more optimizations. > > > > Here is the example: > > > > int add(int a, int b) { return a + b; } > > int sub(int a, int b) { return a - b; } > > > > struct Foo { > > int (*add)(int, int); > > }; > > int main() > > { > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > for (int i = 0; i < 5; i++) { > > foo[i].add = add; > > } > > > > int sum = 0; > > for (int i = 0; i < 5; i++) { > > sum += foo[i].add(1, 2); > > } > > > > return 0; > > } > > > > Can I replace the code above to the code below? > > > > int add(int a, int b) { return a + b; } > > int sub(int a, int b) { return a - b; } > > > > struct Foo { > > int (*add)(int, int); > > }; > > int main() > > { > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > for (int i = 0; i < 5; i++) { > > foo[i].add = add; > > } > > > > int sum = 0; > > for (int i = 0; i < 5; i++) { > > sum += add(1,2); > > } > > > > return 0; > > } > > > > My idea is to determine whether the assignment of the field is the > > same function, and if so, perform the replacement. > > If it's as simple as above then sure, even CSE should do it. If you > can prove otherwise the memory location with the function pointer > always has the same value you are obviously fine. If you just > do not see any other store via 'add's FIELD_DECL then no, that > isn't good enough. Every void * store you do not know where it > goes might go to that slot. > > > Of course this is not a reasonable optimization, I just want to know > > if there are security issues in doing so, and if I want to do it in > > the IPA stage, is it possible? > > For the more general case you can do what we do for speculative > devirtualization - replace the code with > > sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2); Hi Richard, I'm trying to do what you suggested. (sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);) I created a new IPA-Pass before IPA-INLINE and I made the changes on the GIMPLE in the "function_transform" stage. But my newly created "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized out in the subsequent FRE. I would like to ask if there is any way to mark my newly created cgraph_edge as "inline" in the "function_transform" stage. Here is part of the code creating the call_stmt in true basic_block in my IPA-PASS: // ... // part of the code have been omitted auto_vec params; for (int i = 0; i < gimple_call_num_args (call_stmt); i++) { params.safe_push (gimple_call_arg (call_stmt, i)); } gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn() tree tbb_ssa; if (ret_val_flag) { tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL); gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn() } gsi = gsi_start_bb (tbb); gsi_insert_before (, tbb_call, GSI_SAME_STMT); cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create (fndecl), (gcall*)tbb_call, tbb_call->bb->count, true); // what should I do to this cgraph_edge to mark it to be inlined // ... Or should I not do these things in the function_transform stage? Thanks Hanke Zhang > > that way we can inline the direct call and hopefully the branch will be > well predicted. > > In some SPEC there is IIRC the situation where such speculative > devirtualization candidates can be found solely based on function > signature. With LTO/IPA you'd basically collect candidate targets > for each indirect call (possibly address-taken function definitions > with correct signature) and if there's just a single one you can > choose that as speculative devirt target. > > Speculative devirt as we have now of course works with profile > data to identify the most probable candidate. > > Richard. > > > > > Thanks > > Hanke Zhang
Re: Question about optimizing function pointers for direct function calls
On 5/23/24 9:51 PM, Hanke Zhang via Gcc wrote: Hi, I got a question about optimizing function pointers for direct function calls in C. Consider the following scenario: one of the fields of a structure is a function pointer, and all its assignments come from the same function. Can all its uses be replaced by direct calls to this function? So the later passes can do more optimizations. Certainly. The RTL optimizers have been doing this for ~30 years. If they can statically determine the target of an indirect call, they will replace that indirect call with a simple direct call. You're just extending that capability to apply more often, likely by doing it at an earlier stage in the pipeline, presumably in one of the IPA or gimple optimizers? Note that by doing this transformation before the gimple->rtl conversion step, you don't have to worry about quirky ABIs where direct and indirect calls can have different calling conventions. Jeff
Re: Question about optimizing function pointers for direct function calls
On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc wrote: > > Hi, > I got a question about optimizing function pointers for direct > function calls in C. > > Consider the following scenario: one of the fields of a structure is a > function pointer, and all its assignments come from the same function. > Can all its uses be replaced by direct calls to this function? So the > later passes can do more optimizations. > > Here is the example: > > int add(int a, int b) { return a + b; } > int sub(int a, int b) { return a - b; } > > struct Foo { > int (*add)(int, int); > }; > int main() > { > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > for (int i = 0; i < 5; i++) { > foo[i].add = add; > } > > int sum = 0; > for (int i = 0; i < 5; i++) { > sum += foo[i].add(1, 2); > } > > return 0; > } > > Can I replace the code above to the code below? > > int add(int a, int b) { return a + b; } > int sub(int a, int b) { return a - b; } > > struct Foo { > int (*add)(int, int); > }; > int main() > { > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > for (int i = 0; i < 5; i++) { > foo[i].add = add; > } > > int sum = 0; > for (int i = 0; i < 5; i++) { > sum += add(1,2); > } > > return 0; > } > > My idea is to determine whether the assignment of the field is the > same function, and if so, perform the replacement. If it's as simple as above then sure, even CSE should do it. If you can prove otherwise the memory location with the function pointer always has the same value you are obviously fine. If you just do not see any other store via 'add's FIELD_DECL then no, that isn't good enough. Every void * store you do not know where it goes might go to that slot. > Of course this is not a reasonable optimization, I just want to know > if there are security issues in doing so, and if I want to do it in > the IPA stage, is it possible? For the more general case you can do what we do for speculative devirtualization - replace the code with sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2); that way we can inline the direct call and hopefully the branch will be well predicted. In some SPEC there is IIRC the situation where such speculative devirtualization candidates can be found solely based on function signature. With LTO/IPA you'd basically collect candidate targets for each indirect call (possibly address-taken function definitions with correct signature) and if there's just a single one you can choose that as speculative devirt target. Speculative devirt as we have now of course works with profile data to identify the most probable candidate. Richard. > > Thanks > Hanke Zhang
Question about optimizing function pointers for direct function calls
Hi, I got a question about optimizing function pointers for direct function calls in C. Consider the following scenario: one of the fields of a structure is a function pointer, and all its assignments come from the same function. Can all its uses be replaced by direct calls to this function? So the later passes can do more optimizations. Here is the example: int add(int a, int b) { return a + b; } int sub(int a, int b) { return a - b; } struct Foo { int (*add)(int, int); }; int main() { struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); for (int i = 0; i < 5; i++) { foo[i].add = add; } int sum = 0; for (int i = 0; i < 5; i++) { sum += foo[i].add(1, 2); } return 0; } Can I replace the code above to the code below? int add(int a, int b) { return a + b; } int sub(int a, int b) { return a - b; } struct Foo { int (*add)(int, int); }; int main() { struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); for (int i = 0; i < 5; i++) { foo[i].add = add; } int sum = 0; for (int i = 0; i < 5; i++) { sum += add(1,2); } return 0; } My idea is to determine whether the assignment of the field is the same function, and if so, perform the replacement. Of course this is not a reasonable optimization, I just want to know if there are security issues in doing so, and if I want to do it in the IPA stage, is it possible? Thanks Hanke Zhang