Re: Question about optimizing function pointers for direct function calls

2024-06-12 Thread Hanke Zhang via Gcc
Hi Richard,

Thanks for your reply. I'm looking forward to hearing from Honza!

But I still have a small question about the IPA passes. If I want to
make such a conversion (sum += foo[i].add == add ? add (1,2) :
foo[i].add (1,2);), I definitely need to change the function's body.
But the article "Using summary information in IPA
passes"(https://gcc.gnu.org/onlinedocs/gccint/link-time-optimization/using-summary-
Information-in-ipa-passes.html) mentioned that if IPA-Pass needs to
change the function body, it needs to be done in the transform stage,
but the execution stage of IPA-inline has ended before the transform
stage. So we still can’t mark the newly created function
calls/cgraph_edge as inlineable. Is that correct?

Or did I understand something wrong?

Thanks
Hanke Zhang

Richard Biener  于2024年6月12日周三 18:49写道:
>
> On Wed, Jun 12, 2024 at 11:57 AM Hanke Zhang  wrote:
> >
> > Richard Biener  于2024年5月24日周五 14:39写道:
> > >
> > > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc  
> > > wrote:
> > > >
> > > > Hi,
> > > > I got a question about optimizing function pointers for direct
> > > > function calls in C.
> > > >
> > > > Consider the following scenario: one of the fields of a structure is a
> > > > function pointer, and all its assignments come from the same function.
> > > > Can all its uses be replaced by direct calls to this function? So the
> > > > later passes can do more optimizations.
> > > >
> > > > Here is the example:
> > > >
> > > > int add(int a, int b) { return a + b; }
> > > > int sub(int a, int b) { return a - b; }
> > > >
> > > > struct Foo {
> > > > int (*add)(int, int);
> > > > };
> > > > int main()
> > > > {
> > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> > > >
> > > > for (int i = 0; i < 5; i++) {
> > > > foo[i].add = add;
> > > > }
> > > >
> > > > int sum = 0;
> > > > for (int i = 0; i < 5; i++) {
> > > > sum += foo[i].add(1, 2);
> > > > }
> > > >
> > > > return 0;
> > > > }
> > > >
> > > > Can I replace the code above to the code below?
> > > >
> > > > int add(int a, int b) { return a + b; }
> > > > int sub(int a, int b) { return a - b; }
> > > >
> > > > struct Foo {
> > > > int (*add)(int, int);
> > > > };
> > > > int main()
> > > > {
> > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> > > >
> > > > for (int i = 0; i < 5; i++) {
> > > > foo[i].add = add;
> > > > }
> > > >
> > > > int sum = 0;
> > > > for (int i = 0; i < 5; i++) {
> > > > sum += add(1,2);
> > > > }
> > > >
> > > > return 0;
> > > > }
> > > >
> > > > My idea is to determine whether the assignment of the field is the
> > > > same function, and if so, perform the replacement.
> > >
> > > If it's as simple as above then sure, even CSE should do it.  If you
> > > can prove otherwise the memory location with the function pointer
> > > always has the same value you are obviously fine.  If you just
> > > do not see any other store via 'add's FIELD_DECL then no, that
> > > isn't good enough.  Every void * store you do not know where it
> > > goes might go to that slot.
> > >
> > > > Of course this is not a reasonable optimization, I just want to know
> > > > if there are security issues in doing so, and if I want to do it in
> > > > the IPA stage, is it possible?
> > >
> > > For the more general case you can do what we do for speculative
> > > devirtualization - replace the code with
> > >
> > >   sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);
> >
> > Hi Richard,
> >
> > I'm trying to do what you suggested. (sum += foo[i].add == add ? add
> > (1,2) : foo[i].add(1,2);)
> >
> > I created a new IPA-Pass before IPA-INLINE and I made the changes on
> > the GIMPLE in the "function_transform" stage. But my newly created
> > "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized
> > out in the subsequent FRE. I would like to ask if there is any way to
> > mark my newly created

Re: Question about optimizing function pointers for direct function calls

2024-06-12 Thread Richard Biener via Gcc
On Wed, Jun 12, 2024 at 11:57 AM Hanke Zhang  wrote:
>
> Richard Biener  于2024年5月24日周五 14:39写道:
> >
> > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc  wrote:
> > >
> > > Hi,
> > > I got a question about optimizing function pointers for direct
> > > function calls in C.
> > >
> > > Consider the following scenario: one of the fields of a structure is a
> > > function pointer, and all its assignments come from the same function.
> > > Can all its uses be replaced by direct calls to this function? So the
> > > later passes can do more optimizations.
> > >
> > > Here is the example:
> > >
> > > int add(int a, int b) { return a + b; }
> > > int sub(int a, int b) { return a - b; }
> > >
> > > struct Foo {
> > > int (*add)(int, int);
> > > };
> > > int main()
> > > {
> > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> > >
> > > for (int i = 0; i < 5; i++) {
> > > foo[i].add = add;
> > > }
> > >
> > > int sum = 0;
> > > for (int i = 0; i < 5; i++) {
> > > sum += foo[i].add(1, 2);
> > > }
> > >
> > > return 0;
> > > }
> > >
> > > Can I replace the code above to the code below?
> > >
> > > int add(int a, int b) { return a + b; }
> > > int sub(int a, int b) { return a - b; }
> > >
> > > struct Foo {
> > > int (*add)(int, int);
> > > };
> > > int main()
> > > {
> > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> > >
> > > for (int i = 0; i < 5; i++) {
> > > foo[i].add = add;
> > > }
> > >
> > > int sum = 0;
> > > for (int i = 0; i < 5; i++) {
> > > sum += add(1,2);
> > > }
> > >
> > > return 0;
> > > }
> > >
> > > My idea is to determine whether the assignment of the field is the
> > > same function, and if so, perform the replacement.
> >
> > If it's as simple as above then sure, even CSE should do it.  If you
> > can prove otherwise the memory location with the function pointer
> > always has the same value you are obviously fine.  If you just
> > do not see any other store via 'add's FIELD_DECL then no, that
> > isn't good enough.  Every void * store you do not know where it
> > goes might go to that slot.
> >
> > > Of course this is not a reasonable optimization, I just want to know
> > > if there are security issues in doing so, and if I want to do it in
> > > the IPA stage, is it possible?
> >
> > For the more general case you can do what we do for speculative
> > devirtualization - replace the code with
> >
> >   sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);
>
> Hi Richard,
>
> I'm trying to do what you suggested. (sum += foo[i].add == add ? add
> (1,2) : foo[i].add(1,2);)
>
> I created a new IPA-Pass before IPA-INLINE and I made the changes on
> the GIMPLE in the "function_transform" stage. But my newly created
> "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized
> out in the subsequent FRE. I would like to ask if there is any way to
> mark my newly created cgraph_edge as "inline" in the
> "function_transform" stage.
>
> Here is part of the code creating the call_stmt in true basic_block in
> my IPA-PASS:
>
> // ...
> // part of the code have been omitted
> auto_vec params;
> for (int i = 0; i < gimple_call_num_args (call_stmt); i++) {
>   params.safe_push (gimple_call_arg (call_stmt, i));
> }
> gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn()
> tree tbb_ssa;
> if (ret_val_flag) {
>   tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL);
>   gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn()
> }
> gsi = gsi_start_bb (tbb);
> gsi_insert_before (, tbb_call, GSI_SAME_STMT);
> cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create
> (fndecl), (gcall*)tbb_call, tbb_call->bb->count, true);
> // what should I do to this cgraph_edge to mark it to be inlined
> // ...
>
> Or should I not do these things in the function_transform stage?

That's indeed too late.  You have to create speculative call edges,
I would suggest to look how IPA devirtualization handles this case
and possibly simply amend it.  I'm CCing Honza who should know
more about this, I do not know the details.

Richard.

> Thanks
> Hanke Zhang
>
>
> >
> > that way we can inline the direct call and hopefully the branch will be
> > well predicted.
> >
> > In some SPEC there is IIRC the situation where such speculative
> > devirtualization candidates can be found solely based on function
> > signature.  With LTO/IPA you'd basically collect candidate targets
> > for each indirect call (possibly address-taken function definitions
> > with correct signature) and if there's just a single one you can
> > choose that as speculative devirt target.
> >
> > Speculative devirt as we have now of course works with profile
> > data to identify the most probable candidate.
> >
> > Richard.
> >
> > >
> > > Thanks
> > > Hanke Zhang


Re: Question about optimizing function pointers for direct function calls

2024-06-12 Thread Hanke Zhang via Gcc
Richard Biener  于2024年5月24日周五 14:39写道:
>
> On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc  wrote:
> >
> > Hi,
> > I got a question about optimizing function pointers for direct
> > function calls in C.
> >
> > Consider the following scenario: one of the fields of a structure is a
> > function pointer, and all its assignments come from the same function.
> > Can all its uses be replaced by direct calls to this function? So the
> > later passes can do more optimizations.
> >
> > Here is the example:
> >
> > int add(int a, int b) { return a + b; }
> > int sub(int a, int b) { return a - b; }
> >
> > struct Foo {
> > int (*add)(int, int);
> > };
> > int main()
> > {
> > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> >
> > for (int i = 0; i < 5; i++) {
> > foo[i].add = add;
> > }
> >
> > int sum = 0;
> > for (int i = 0; i < 5; i++) {
> > sum += foo[i].add(1, 2);
> > }
> >
> > return 0;
> > }
> >
> > Can I replace the code above to the code below?
> >
> > int add(int a, int b) { return a + b; }
> > int sub(int a, int b) { return a - b; }
> >
> > struct Foo {
> > int (*add)(int, int);
> > };
> > int main()
> > {
> > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> >
> > for (int i = 0; i < 5; i++) {
> > foo[i].add = add;
> > }
> >
> > int sum = 0;
> > for (int i = 0; i < 5; i++) {
> > sum += add(1,2);
> > }
> >
> > return 0;
> > }
> >
> > My idea is to determine whether the assignment of the field is the
> > same function, and if so, perform the replacement.
>
> If it's as simple as above then sure, even CSE should do it.  If you
> can prove otherwise the memory location with the function pointer
> always has the same value you are obviously fine.  If you just
> do not see any other store via 'add's FIELD_DECL then no, that
> isn't good enough.  Every void * store you do not know where it
> goes might go to that slot.
>
> > Of course this is not a reasonable optimization, I just want to know
> > if there are security issues in doing so, and if I want to do it in
> > the IPA stage, is it possible?
>
> For the more general case you can do what we do for speculative
> devirtualization - replace the code with
>
>   sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);

Hi Richard,

I'm trying to do what you suggested. (sum += foo[i].add == add ? add
(1,2) : foo[i].add(1,2);)

I created a new IPA-Pass before IPA-INLINE and I made the changes on
the GIMPLE in the "function_transform" stage. But my newly created
"foo[i].add(1,2)" seems to fail to be inlined. And it was optimized
out in the subsequent FRE. I would like to ask if there is any way to
mark my newly created cgraph_edge as "inline" in the
"function_transform" stage.

Here is part of the code creating the call_stmt in true basic_block in
my IPA-PASS:

// ...
// part of the code have been omitted
auto_vec params;
for (int i = 0; i < gimple_call_num_args (call_stmt); i++) {
  params.safe_push (gimple_call_arg (call_stmt, i));
}
gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn()
tree tbb_ssa;
if (ret_val_flag) {
  tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL);
  gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn()
}
gsi = gsi_start_bb (tbb);
gsi_insert_before (, tbb_call, GSI_SAME_STMT);
cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create
(fndecl), (gcall*)tbb_call, tbb_call->bb->count, true);
// what should I do to this cgraph_edge to mark it to be inlined
// ...

Or should I not do these things in the function_transform stage?

Thanks
Hanke Zhang


>
> that way we can inline the direct call and hopefully the branch will be
> well predicted.
>
> In some SPEC there is IIRC the situation where such speculative
> devirtualization candidates can be found solely based on function
> signature.  With LTO/IPA you'd basically collect candidate targets
> for each indirect call (possibly address-taken function definitions
> with correct signature) and if there's just a single one you can
> choose that as speculative devirt target.
>
> Speculative devirt as we have now of course works with profile
> data to identify the most probable candidate.
>
> Richard.
>
> >
> > Thanks
> > Hanke Zhang


Re: Question about optimizing function pointers for direct function calls

2024-05-24 Thread Jeff Law via Gcc




On 5/23/24 9:51 PM, Hanke Zhang via Gcc wrote:

Hi,
I got a question about optimizing function pointers for direct
function calls in C.

Consider the following scenario: one of the fields of a structure is a
function pointer, and all its assignments come from the same function.
Can all its uses be replaced by direct calls to this function? So the
later passes can do more optimizations.
Certainly.  The RTL optimizers have been doing this for ~30 years.  If 
they can statically determine the target of an indirect call, they will 
replace that indirect call with a simple direct call.


You're just extending that capability to apply more often, likely by 
doing it at an earlier stage in the pipeline, presumably in one of the 
IPA or gimple optimizers?



Note that by doing this transformation before the gimple->rtl conversion 
step, you don't have to worry about quirky ABIs where direct and 
indirect calls can have different calling conventions.


Jeff


Re: Question about optimizing function pointers for direct function calls

2024-05-24 Thread Richard Biener via Gcc
On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc  wrote:
>
> Hi,
> I got a question about optimizing function pointers for direct
> function calls in C.
>
> Consider the following scenario: one of the fields of a structure is a
> function pointer, and all its assignments come from the same function.
> Can all its uses be replaced by direct calls to this function? So the
> later passes can do more optimizations.
>
> Here is the example:
>
> int add(int a, int b) { return a + b; }
> int sub(int a, int b) { return a - b; }
>
> struct Foo {
> int (*add)(int, int);
> };
> int main()
> {
> struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
>
> for (int i = 0; i < 5; i++) {
> foo[i].add = add;
> }
>
> int sum = 0;
> for (int i = 0; i < 5; i++) {
> sum += foo[i].add(1, 2);
> }
>
> return 0;
> }
>
> Can I replace the code above to the code below?
>
> int add(int a, int b) { return a + b; }
> int sub(int a, int b) { return a - b; }
>
> struct Foo {
> int (*add)(int, int);
> };
> int main()
> {
> struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
>
> for (int i = 0; i < 5; i++) {
> foo[i].add = add;
> }
>
> int sum = 0;
> for (int i = 0; i < 5; i++) {
> sum += add(1,2);
> }
>
> return 0;
> }
>
> My idea is to determine whether the assignment of the field is the
> same function, and if so, perform the replacement.

If it's as simple as above then sure, even CSE should do it.  If you
can prove otherwise the memory location with the function pointer
always has the same value you are obviously fine.  If you just
do not see any other store via 'add's FIELD_DECL then no, that
isn't good enough.  Every void * store you do not know where it
goes might go to that slot.

> Of course this is not a reasonable optimization, I just want to know
> if there are security issues in doing so, and if I want to do it in
> the IPA stage, is it possible?

For the more general case you can do what we do for speculative
devirtualization - replace the code with

  sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);

that way we can inline the direct call and hopefully the branch will be
well predicted.

In some SPEC there is IIRC the situation where such speculative
devirtualization candidates can be found solely based on function
signature.  With LTO/IPA you'd basically collect candidate targets
for each indirect call (possibly address-taken function definitions
with correct signature) and if there's just a single one you can
choose that as speculative devirt target.

Speculative devirt as we have now of course works with profile
data to identify the most probable candidate.

Richard.

>
> Thanks
> Hanke Zhang


Question about optimizing function pointers for direct function calls

2024-05-23 Thread Hanke Zhang via Gcc
Hi,
I got a question about optimizing function pointers for direct
function calls in C.

Consider the following scenario: one of the fields of a structure is a
function pointer, and all its assignments come from the same function.
Can all its uses be replaced by direct calls to this function? So the
later passes can do more optimizations.

Here is the example:

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

struct Foo {
int (*add)(int, int);
};
int main()
{
struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);

for (int i = 0; i < 5; i++) {
foo[i].add = add;
}

int sum = 0;
for (int i = 0; i < 5; i++) {
sum += foo[i].add(1, 2);
}

return 0;
}

Can I replace the code above to the code below?

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

struct Foo {
int (*add)(int, int);
};
int main()
{
struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);

for (int i = 0; i < 5; i++) {
foo[i].add = add;
}

int sum = 0;
for (int i = 0; i < 5; i++) {
sum += add(1,2);
}

return 0;
}

My idea is to determine whether the assignment of the field is the
same function, and if so, perform the replacement.

Of course this is not a reasonable optimization, I just want to know
if there are security issues in doing so, and if I want to do it in
the IPA stage, is it possible?

Thanks
Hanke Zhang