Hi, first thanks for the heads-up. Pretty interesting finding.
2013/6/15 Óscar Fuentes <[email protected]>: > This issue was raised on the sf forum: > > http://sourceforge.net/p/mingw-w64/discussion/723797/thread/cd8a855f > > but I very much prefer to use the mailing list. > > The problem: marking a class as dllexport prevents the inlining of its > class methods. I guess the same applies to inline functions. This has a > huge performance impact on C++ code that uses tiny methods (i.e. the > setter/getter idiom). I've experienced slowdowns of 4x on real world > cases. > > The OP filed a bug report on GCC: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56892 Yes, I was working on this subject too. But from a different perspective, as I found that an inline-function with a dllexport should be only exported if option -fkeep-inline-dllexport was set. And here we seem to have an bug, too. But yes, those functions weren't inlined at all. Even worse - and that seems to be an issue for none-mingw-targets too - is that a function with an explicit calling-convention isn't inlined anymore. I worked a patch to i386.c about the function_inlinable_p HOOK and test a function returning always true. As in general i386 doesn't have machine-attributes which would prevent inlining in general. > I've experimented with using --export-all-symbols, but it creates > duplicate definitions errors when there are several DLLs involved, some > of them depending on the others and on the same C++ static libraries. Yes, export-al-symbols is a pretty bad option due it does for sure something different as OP expects. it is better to prominent mark symbols via dllexport to be exported. Due pretty nice working psuedo-relocation-code in our toolchain is the dllimport not absolute necessary. Nevertheless it doesn't hurt to use it right. > So, in desperation, I dived into GCC sources and hacked the patch below. > Obviously, it is not the Right Way of fixing GCC but it "works" and > allows me to start considering mingw32-w64 as a substitute for MSVC. > Hopefully it is useful to others as a temporary band-aid. For the same > project that was 4x slower than when compiled with MSVC, suddenly it > becomes clearly faster than the MSVC version when DW2 exceptions are > used (using DW2 is cheating on my book, but that's another story.) > > Modified gcc/integrate.c > diff --git a/gcc/integrate.c b/gcc/integrate.c > index 3a79183..f6c9e93 100644 > --- a/gcc/integrate.c > +++ b/gcc/integrate.c > @@ -77,6 +77,9 @@ function_attribute_inlinable_p (const_tree fndecl) > { > const_tree a; > > + if (lookup_attribute ("dllexport", DECL_ATTRIBUTES (fndecl))) > + return true; > + > for (a = DECL_ATTRIBUTES (fndecl); a; a = TREE_CHAIN (a)) > { > const_tree name = TREE_PURPOSE (a); > > > (Caveat: yesterday I looked at GCC sources for first time, so saying > that I'm a total beginner would be some sort of aggrandizement.) Not bad for beginner, but place is indeed bad. As you checking for a target-specific attribute in an target-agnostic place. > On that function, I was unable to clearly picture what > > targetm.function_attribute_inlinable_p > > really aims to. In my experiments it always returns false (it > corresponds hook_bool_const_tree_false) which means that the fndecl is > not inlineable if it has any attribute. Moreover, I don't understand why > it loops through the attributes for deciding the result on a single call > whose arguments do not depend on anything inside the loop, except for > the decision of calling function_attribute_inlinable_p if fndecl has > some attribute. Well, this target-hook has to check *all* machine attributes. The function_inlinable_p routine just checks if at least one attribute occures and then call this hook. Not sure why the hook doesn't take directly the attribute-name to verify, or why not directly the target-hook is called (and this hook checks itself iff one attribute of interest was set), but I assume that thinking here is some weird form of thinking to optimize call ... anyway The patch I am testing right now is: Index: i386.c =================================================================== --- i386.c (Revision 200128) +++ i386.c (Arbeitskopie) @@ -38721,6 +38721,12 @@ static const struct attribute_spec ix86_attribute_ { NULL, 0, 0, false, false, false, NULL, false } }; +static bool +ix86_function_attribute_inlinable_p (const_tree fndecl ATTRIBUTE_UNUSED) +{ + return true; +} + /* Implement targetm.vectorize.builtin_vectorization_cost. */ static int ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost, @@ -42700,6 +42706,8 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val) #undef TARGET_ATTRIBUTE_TABLE #define TARGET_ATTRIBUTE_TABLE ix86_attribute_table +#undef TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P +#define TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P ix86_function_attribute_inlinable_p #if TARGET_DLLIMPORT_DECL_ATTRIBUTES # undef TARGET_MERGE_DECL_ATTRIBUTES # define TARGET_MERGE_DECL_ATTRIBUTES merge_dllimport_decl_attributes > Wouldn't it be simpler to call targetm.function_attribute_inlinable_p > without all the looping? May ... Regards, Kai ------------------------------------------------------------------------------ This SF.net email is sponsored by Windows: Build for Windows Store. http://p.sf.net/sfu/windows-dev2dev _______________________________________________ Mingw-w64-public mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
