Hi,

first thanks for the heads-up.  Pretty interesting finding.

2013/6/15 Óscar Fuentes <[email protected]>:
> This issue was raised on the sf forum:
>
> http://sourceforge.net/p/mingw-w64/discussion/723797/thread/cd8a855f
>
> but I very much prefer to use the mailing list.
>
> The problem: marking a class as dllexport prevents the inlining of its
> class methods. I guess the same applies to inline functions. This has a
> huge performance impact on C++ code that uses tiny methods (i.e. the
> setter/getter idiom). I've experienced slowdowns of 4x on real world
> cases.
>
> The OP filed a bug report on GCC:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56892

Yes, I was working on this subject too.  But from a different
perspective, as I found that an inline-function with a dllexport
should be only exported if option -fkeep-inline-dllexport was set.
And here we seem to have an bug, too.
But yes, those functions weren't inlined at all.  Even worse - and
that seems to be an issue for none-mingw-targets too - is that a
function with an explicit calling-convention isn't inlined anymore.
I worked a patch to i386.c about the function_inlinable_p HOOK and
test a function returning always true.  As in general i386 doesn't
have machine-attributes which would prevent inlining in general.

> I've experimented with using --export-all-symbols, but it creates
> duplicate definitions errors when there are several DLLs involved, some
> of them depending on the others and on the same C++ static libraries.

Yes, export-al-symbols is a pretty bad option due it does for sure
something different as OP expects.  it is better to prominent mark
symbols via dllexport to be exported.  Due pretty nice working
psuedo-relocation-code in our toolchain is the dllimport not absolute
necessary.  Nevertheless it doesn't hurt to use it right.

> So, in desperation, I dived into GCC sources and hacked the patch below.
> Obviously, it is not the Right Way of fixing GCC but it "works" and
> allows me to start considering mingw32-w64 as a substitute for MSVC.
> Hopefully it is useful to others as a temporary band-aid. For the same
> project that was 4x slower than when compiled with MSVC, suddenly it
> becomes clearly faster than the MSVC version when DW2 exceptions are
> used (using DW2 is cheating on my book, but that's another story.)
>
>         Modified   gcc/integrate.c
> diff --git a/gcc/integrate.c b/gcc/integrate.c
> index 3a79183..f6c9e93 100644
> --- a/gcc/integrate.c
> +++ b/gcc/integrate.c
> @@ -77,6 +77,9 @@ function_attribute_inlinable_p (const_tree fndecl)
>      {
>        const_tree a;
>
> +      if (lookup_attribute ("dllexport", DECL_ATTRIBUTES (fndecl)))
> +        return true;
> +
>        for (a = DECL_ATTRIBUTES (fndecl); a; a = TREE_CHAIN (a))
>         {
>           const_tree name = TREE_PURPOSE (a);
>
>
> (Caveat: yesterday I looked at GCC sources for first time, so saying
> that I'm a total beginner would be some sort of aggrandizement.)

Not bad for beginner, but place is indeed bad.  As you checking for a
target-specific attribute in an target-agnostic place.

> On that function, I was unable to clearly picture what
>
> targetm.function_attribute_inlinable_p
>
> really aims to. In my experiments it always returns false (it
> corresponds hook_bool_const_tree_false) which means that the fndecl is
> not inlineable if it has any attribute. Moreover, I don't understand why
> it loops through the attributes for deciding the result on a single call
> whose arguments do not depend on anything inside the loop, except for
> the decision of calling function_attribute_inlinable_p if fndecl has
> some attribute.

Well, this target-hook has to check *all* machine attributes.  The
function_inlinable_p routine just checks if at least one attribute
occures and then call this hook.  Not sure why the hook doesn't take
directly the attribute-name to verify, or why not directly the
target-hook is called (and this hook checks itself iff one attribute
of interest was set),  but I assume that thinking here is some weird
form of thinking to optimize call ...  anyway

The patch I am testing right now is:

Index: i386.c
===================================================================
--- i386.c      (Revision 200128)
+++ i386.c      (Arbeitskopie)
@@ -38721,6 +38721,12 @@ static const struct attribute_spec ix86_attribute_
   { NULL,        0, 0, false, false, false, NULL, false }
 };

+static bool
+ix86_function_attribute_inlinable_p (const_tree fndecl ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 /* Implement targetm.vectorize.builtin_vectorization_cost.  */
 static int
 ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
@@ -42700,6 +42706,8 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)

 #undef TARGET_ATTRIBUTE_TABLE
 #define TARGET_ATTRIBUTE_TABLE ix86_attribute_table
+#undef TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P
+#define TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P
ix86_function_attribute_inlinable_p
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 #  undef TARGET_MERGE_DECL_ATTRIBUTES
 #  define TARGET_MERGE_DECL_ATTRIBUTES merge_dllimport_decl_attributes

> Wouldn't it be simpler to call targetm.function_attribute_inlinable_p
> without all the looping?
May ...

Regards,
Kai

------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to