Re: Plan for removing global state from GCC's internals
On Mon, 2013-07-01 at 19:43 +, Joseph S. Myers wrote: On Mon, 1 Jul 2013, David Malcolm wrote: [...] Would you be in favor killing off these macros: #define input_line LOCATION_LINE (input_location) #define input_filename LOCATION_FILE (input_location) #define in_system_header (in_system_header_at (input_location)) with patches that make the usage of input_location explicit? (by replacing all uses of these macros with their expansions, cleaning up line-wraps as needed). Yes. The only other macro that implicitly uses input_location is EXPR_LOC_OR_HERE; that could be removed in favor of: EXPR_LOC_OR_LOC(expr, input_location) again making input_location explicit. (I suspect then eliminating the input_location from this - ensuring all expressions have meaningful locations so EXPR_LOC_OR_LOC isn't needed at all - will depend on Andrew MacLeod's proposal. It doesn't explicitly mention this, but one thing that would be desirable as part of making front ends generate internal representation closer to the source would be explicitly representing locations for constants, and for references to declarations within expressions, so that everywhere that wants a location for an expression can reliably extract one from it rather than finding there is no location because certain expressions are shared.) Thanks. I've posted a patch for review that removes these macros: http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00072.html
Re: Plan for removing global state from GCC's internals
Aaron: have you done the patch submission paperwork with the FSF? (as per http://gcc.gnu.org/contribute.html#legal ) If so, is your work available somewhere? Thanks Dave On Mon, 2013-07-01 at 23:56 +0100, Aaron Gray wrote: I started to do this starting with the C++ parser class'izing it but no one was interested. On 1 July 2013 20:43, Joseph S. Myers jos...@codesourcery.com wrote: On Mon, 1 Jul 2013, David Malcolm wrote: As for accessing globals directly versus via APIs: yes, I suppose you do still have an access to a global class instance in each place you formerly had a global variable (that's now a member of that class), so by itself such a conversion to a better API doesn't reduce the number of global variable accesses, just improves the interface in other ways - and it's the changes to pass a pointer to an instance around that reduce the global state usage. In the case of dump files, pass-local state may be a better place than the universe to keep the instance - it is after all passes.c that calls dump_start / dump_finish. So a pass instance should have its own dump_flags, and various dump methods? Perhaps, but as before, I'd prefer to fix the state issue Yes (or rather, the pass instance should contain an instance of the dumper class, which in turn has dump_flags and dump_file members) - as far as I can tell, the lifetime of dump_file and dump_flags is already basically per-pass rather than global. Would you be in favor killing off these macros: #define input_line LOCATION_LINE (input_location) #define input_filename LOCATION_FILE (input_location) #define in_system_header (in_system_header_at (input_location)) with patches that make the usage of input_location explicit? (by replacing all uses of these macros with their expansions, cleaning up line-wraps as needed). Yes. The only other macro that implicitly uses input_location is EXPR_LOC_OR_HERE; that could be removed in favor of: EXPR_LOC_OR_LOC(expr, input_location) again making input_location explicit. 
(I suspect then eliminating the input_location from this - ensuring all expressions have meaningful locations so EXPR_LOC_OR_LOC isn't needed at all - will depend on Andrew MacLeod's proposal. It doesn't explicitly mention this, but one thing that would be desirable as part of making front ends generate internal representation closer to the source would be explicitly representing locations for constants, and for references to declarations within expressions, so that everywhere that wants a location for an expression can reliably extract one from it rather than finding there is no location because certain expressions are shared.)
Re: Plan for removing global state from GCC's internals
On Mon, 1 Jul 2013, Aaron Gray wrote: I started to do this starting with the C++ parser class'izing it but no one was interested. The C++ parser types such as cp_parser and cp_lexer already do a good job of avoiding global state. I am not an expert on good C++ coding practices and don't know to what extent the objections given in http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02019.html might apply to any parts of David's proposal. But in David's proposal, conversion to classes is a means to an end - eliminating global state through passing implicit this pointers - and not an end in itself if the global state does not exist, or if it can easily be moved inside an existing structure such as cp_parser or gcc_options rather than needing to go inside some new class. (I would add that the optimizations for singletons are also a means to an end. In cases where we might reasonably expect multiple instances of something inside a universe, we should be perfectly willing not to use those optimizations and always to pass pointers around, whether explicitly or implicitly. And we should expect that as interfaces get cleaned up in future and are adapted to different ways people may wish to use GCC in a library, cases that start off by using the singleton optimizations are likely to have them removed.) -- Joseph S. Myers jos...@codesourcery.com
Re: Plan for removing global state from GCC's internals
On Tue, Jul 2, 2013 at 3:25 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Mon, 1 Jul 2013, Aaron Gray wrote: I started to do this starting with the C++ parser class'izing it but no one was interested. The C++ parser types such as cp_parser and cp_lexer already do a good job of avoiding global state. The assertion that nobody was interested is, of course, untrue. I did give feedback; but I never heard back after that. I am not an expert on good C++ coding practices and don't know to what extent the objections given in http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02019.html might apply to any parts of David's proposal. But in David's proposal, conversion to classes is a means to an end - eliminating global state through passing implicit this pointers - and not an end in itself if the global state does not exist, or if it can easily be moved inside an existing structure such as cp_parser or gcc_options rather than needing to go inside some new class. Amen. -- Gaby
Re: Plan for removing global state from GCC's internals
On Thu, 2013-06-27 at 20:23 +, Joseph S. Myers wrote: On Thu, 27 Jun 2013, David Malcolm wrote: I want to focus on removal of global state, and I want that to be separate from cleanups of internal APIs. There are several interpretations of the word global in this conversation, and I think I was unclear what I meant; sorry about that. The word global can refer both to visibility, and to lifetime. I'm interested in *lifetime*: when do variables get written to? Where does the value of a variable get used? For example, consider the three variable declarations in tracer.c: static int probability_cutoff; static int branch_ratio_cutoff; sbitmap bb_seen; It turns out that bb_seen is only used in that file, so it can be made static there. With that, these variables have file-local visibility, but currently have global lifetime: they live in the .bss section of the built code, and further study of the code would be needed before a new reader (be they human or a compiler) can say when these variables change state, and where their state is used, beyond just saying sometime during the lifetime of the process. As it happens, these variables follow a very common read/write pattern: they are initialized near the start of the execute hook of a pass, and cleaned up at the end of the hook (in this case, within the tail_duplicate function called within the tracer execute hook of pass_tracer). Also, none of the variables are GTY-marked. This is one of the common state-management patterns in GCC's passes: http://dmalcolm.fedorapeople.org/gcc/global-state/pass-patterns.html#per-invocation-state-with-no-gty-markings The plan there gives a way of moving this state to the stack for the shared-library case (to allow thread-safe usage), whilst keeping it in the .bss section for the traditional build case (for maximum performance, or, at least, consistent performance), with relatively little patching. 
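To make the per-invocation pattern described above concrete, here is a minimal sketch of moving such state out of file-scope statics and into an object whose lifetime matches one pass invocation. It is not code from GCC or from the plan: tracer_state and count_newly_seen are hypothetical names, std::vector<bool> stands in for sbitmap, and the cutoff values are arbitrary placeholders.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for tracer.c's file-scope variables
// (probability_cutoff, branch_ratio_cutoff, bb_seen), gathered into one
// object that is created at the start of a pass invocation and destroyed
// at the end of it.
struct tracer_state
{
  int probability_cutoff;
  int branch_ratio_cutoff;
  std::vector<bool> bb_seen;   // stand-in for GCC's sbitmap

  explicit tracer_state (int n_basic_blocks)
    : probability_cutoff (50),      // arbitrary placeholder value
      branch_ratio_cutoff (10),     // arbitrary placeholder value
      bb_seen (n_basic_blocks, false)
  {}

  void mark_seen (int bb) { bb_seen[bb] = true; }
  bool seen_p (int bb) const { return bb_seen[bb]; }
};

// Hypothetical "execute hook": the state lives on the stack, so two
// invocations (possibly on different threads) cannot interfere.
int
count_newly_seen (const std::vector<int> &trace, int n_basic_blocks)
{
  tracer_state state (n_basic_blocks);
  int newly_seen = 0;
  for (int bb : trace)
    if (!state.seen_p (bb))
      {
        state.mark_seen (bb);
        ++newly_seen;
      }
  return newly_seen;
}
```

Because the state object is constructed and destroyed inside the hook, a reader can see exactly when it is written and where it is used, which is the lifetime question raised above.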
By contrast, consider this declaration from tree-ssa.c: static struct pointer_map_t *edge_var_maps; Although this is marked as static, it's used in the implementation of an internal API redirect_edge_var_map_* used in 5 other source files. So although this has file-local visibility, the lifetime of the underlying state is considerably more complicated. Whereas I'm thinking of global state as being a symptom of a problem - messy interfaces that have accreted over time - rather than the problem in itself. And moving things into universe allows a proof-of-concept of a shared library build (much like Joern's multi-target patches with namespaces three years ago provided a proof-of-concept of a multi-target build) without really addressing the real problem (basically, I think of state in universe as effectively being global state, and moving state into something passed down to the places needing it - only the relevant bits of state, not a whole universe pointer if there's a smaller logical unit - rather than just accessed through TLS, as being the point where global state is *really* eliminated). From a lifetime meaning of global, the universe ceases to be global state in a shared-library build: there are zero or more parallel universes within one process, all independent of each other; such parallel universes can be created and destroyed by client code. You raise a concern about restricting where state can be used: the universe object will indeed contain a grab-bag of pointers to various other objects, and so it's possible to write code that pokes at one aspect of state from another unrelated aspect of state, using the universe object as a nexus. 
I wrote about this in: http://dmalcolm.fedorapeople.org/gcc/global-state/plan.html#parallel-universes-vs-modularity where my view is that for an initial iteration of this work we *need* to have such a nexus: we have a spaghetti of interactions already; I'm merely trying to support having multiple, independent plates of spaghetti, if you will, prior to disentangling. I think Andrew MacLeod's proposal is really the answer here for these concerns, and I see our proposals as compatible. There are various places in my plan where I use classes to restrict access: for example, I have a class frontend and class backend; presumably stuff could be placed there in an effort to hide things (either a ravioli or lasagna model, if that's not stretching the metaphor too far: encapsulation and layering respectively). Now, the bulk conversion to universes seems a lot more maintainable than Joern's multi-target patches, and a lot more plausibly an incremental step to a proper fix, and so a lot more reasonable to go in as an incremental step, but I'd still think of it as one of the infamous partial transitions in the absence of a reason to believe, for each formerly-global object being accessed via the universe (or some other piece of context), that it's being accessed via the right piece of context being passed down to functions that need
Re: Plan for removing global state from GCC's internals
On Mon, 1 Jul 2013, David Malcolm wrote: As for accessing globals directly versus via APIs: yes, I suppose you do still have an access to a global class instance in each place you formerly had a global variable (that's now a member of that class), so by itself such a conversion to a better API doesn't reduce the number of global variable accesses, just improves the interface in other ways - and it's the changes to pass a pointer to an instance around that reduce the global state usage. In the case of dump files, pass-local state may be a better place than the universe to keep the instance - it is after all passes.c that calls dump_start / dump_finish. So a pass instance should have its own dump_flags, and various dump methods? Perhaps, but as before, I'd prefer to fix the state issue Yes (or rather, the pass instance should contain an instance of the dumper class, which in turn has dump_flags and dump_file members) - as far as I can tell, the lifetime of dump_file and dump_flags is already basically per-pass rather than global. Would you be in favor killing off these macros: #define input_line LOCATION_LINE (input_location) #define input_filename LOCATION_FILE (input_location) #define in_system_header (in_system_header_at (input_location)) with patches that make the usage of input_location explicit? (by replacing all uses of these macros with their expansions, cleaning up line-wraps as needed). Yes. The only other macro that implicitly uses input_location is EXPR_LOC_OR_HERE; that could be removed in favor of: EXPR_LOC_OR_LOC(expr, input_location) again making input_location explicit. (I suspect then eliminating the input_location from this - ensuring all expressions have meaningful locations so EXPR_LOC_OR_LOC isn't needed at all - will depend on Andrew MacLeod's proposal. 
It doesn't explicitly mention this, but one thing that would be desirable as part of making front ends generate internal representation closer to the source would be explicitly representing locations for constants, and for references to declarations within expressions, so that everywhere that wants a location for an expression can reliably extract one from it rather than finding there is no location because certain expressions are shared.) -- Joseph S. Myers jos...@codesourcery.com
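The suggestion that a pass instance should contain an instance of a dumper class, which in turn holds the dump_file and dump_flags members, could look roughly like the sketch below. This is only an illustration under assumptions: the class and method names (dumper, pass, start, finish, get_dumper) are hypothetical and are not taken from GCC's passes.c or dumpfile.h.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical dumper: owns what are currently the dump_file and
// dump_flags globals.
class dumper
{
public:
  dumper () : m_dump_file (nullptr), m_dump_flags (0) {}

  // Called when dumping begins and ends for the owning pass.
  void start (std::FILE *f, int flags) { m_dump_file = f; m_dump_flags = flags; }
  void finish () { m_dump_file = nullptr; m_dump_flags = 0; }

  bool enabled_p () const { return m_dump_file != nullptr; }
  int flags () const { return m_dump_flags; }

private:
  std::FILE *m_dump_file;
  int m_dump_flags;
};

// Hypothetical pass: each instance carries its own dumper, so dump state
// is per-pass rather than global.
class pass
{
public:
  explicit pass (const char *name) : m_name (name) {}
  const char *name () const { return m_name; }
  dumper &get_dumper () { return m_dumper; }

private:
  const char *m_name;
  dumper m_dumper;
};
```

With this shape, starting a dump for one pass leaves every other pass instance untouched, which matches the observation that the lifetime of dump_file and dump_flags is already basically per-pass.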
Re: Plan for removing global state from GCC's internals
I started to do this starting with the C++ parser class'izing it but no one was interested. On 1 July 2013 20:43, Joseph S. Myers jos...@codesourcery.com wrote: On Mon, 1 Jul 2013, David Malcolm wrote: As for accessing globals directly versus via APIs: yes, I suppose you do still have an access to a global class instance in each place you formerly had a global variable (that's now a member of that class), so by itself such a conversion to a better API doesn't reduce the number of global variable accesses, just improves the interface in other ways - and it's the changes to pass a pointer to an instance around that reduce the global state usage. In the case of dump files, pass-local state may be a better place than the universe to keep the instance - it is after all passes.c that calls dump_start / dump_finish. So a pass instance should have its own dump_flags, and various dump methods? Perhaps, but as before, I'd prefer to fix the state issue Yes (or rather, the pass instance should contain an instance of the dumper class, which in turn has dump_flags and dump_file members) - as far as I can tell, the lifetime of dump_file and dump_flags is already basically per-pass rather than global. Would you be in favor killing off these macros: #define input_line LOCATION_LINE (input_location) #define input_filename LOCATION_FILE (input_location) #define in_system_header (in_system_header_at (input_location)) with patches that make the usage of input_location explicit? (by replacing all uses of these macros with their expansions, cleaning up line-wraps as needed). Yes. The only other macro that implicitly uses input_location is EXPR_LOC_OR_HERE; that could be removed in favor of: EXPR_LOC_OR_LOC(expr, input_location) again making input_location explicit. (I suspect then eliminating the input_location from this - ensuring all expressions have meaningful locations so EXPR_LOC_OR_LOC isn't needed at all - will depend on Andrew MacLeod's proposal. 
It doesn't explicitly mention this, but one thing that would be desirable as part of making front ends generate internal representation closer to the source would be explicitly representing locations for constants, and for references to declarations within expressions, so that everywhere that wants a location for an expression can reliably extract one from it rather than finding there is no location because certain expressions are shared.) -- Joseph S. Myers jos...@codesourcery.com
Re: Plan for removing global state from GCC's internals
On Wed, 26 Jun 2013, David Malcolm wrote: FWIW I wonder to what extent the discussions that follow all exhibit a tradeoff between the desire to provide a clean API vs the desire to minimize the size of the patch (to reduce backporting pain). I don't think reducing backporting pain is particularly relevant. There are lots of things to clean up; the question is priorities and ordering rather than whether some particular cleanup should be avoided because of backporting pain it causes. * For dump_file and associated variables such as dump_flags, I sort of think there should be a proper API for code writing to dumps rather than directly accessing dump_file all over the compiler. That should massively reduce the number of places needing to access those fields of the universe. Right, but doing this would be a big patch, touching numerous files. You said you had refactoring tools to make such things easier. Given such tools, I'm imagining the large but mechanical changes shouldn't be so much more work than smaller mechanical changes for the same issue. What would a new API look like? The classic problem with logging/dumping is that you want to avoid doing work in the no-logging case, so that rather than:

    log ("some message: %s", perform_some_expensive_computation ());

with enable/disable in log (and thus always doing the expensive work and typically discarding it), you have things like:

    if (logging)
      {
        log ("some message: %s", perform_some_expensive_computation ());
      }

If we're going to have a logging conditional there, that's going to involve passing around or looking up a variable, so why not simply make it the dump_file? That's a reason for the API to separate whether to generate data for dumps from the actual dumping. But I think the API for whether to dump should (if we think about what a good design would be, rather than what a minimal patch would be) be something that returns a boolean, not a FILE *.
The fact that a FILE * is involved internally isn't something most of the code all over the compiler generating data for dumps should need to care about. Here's one motivation for wanting better APIs both for this and for .s output. Some places in the compiler use printf formats such as HOST_WIDE_INT_PRINT_DEC. Others use GCC pretty-printer formats such as %wd (the pretty-printer implementation internally uses HOST_WIDE_INT_PRINT_DEC to implement %wd). People need to remember what to use where, and if they use the wrong format in the wrong place they may get host-dependent bugs. Generally it's a bad idea for host-portability details such as HOST_WIDE_INT_PRINT_DEC to be used all over the place - it's better if they can be used only in a few places implementing formats such as %wd, with everything else using the format %wd. If you have an API such as dump_file_fprintf, you can do that. If code is using fprintf directly on dump_file, you can't. Now, creating interfaces such as dump_file_fprintf doesn't itself clean up this issue - a separate patch would be needed that makes dump_file_fprintf use the pretty-printer code and makes all formats used with dump_file_fprintf use the pretty-printer formats and only those formats (and not any printf features not supported by the pretty-printer code). But it's a useful starting point to facilitate such a cleanup. And changing fprintf (dump_file, ...) to dump_file_fprintf (...) should be a straightforward change to make, given a refactoring tool that will handle reindenting subsequent lines of arguments to the call to match the new column of the open-parenthesis (and wrapping if that would make lines too long). It might be a larger patch than your suggestion of setting a dump_file local variable from the universe at the start of each function, but I think it's a better interface. 
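A minimal sketch of the API shape argued for above: callers ask a boolean question before doing expensive work, and all writing goes through one printf-style entry point, so the underlying stream and the set of accepted formats are implementation details. The names here (dumper, enabled_p, dump_printf, run_step) are hypothetical, and the sketch collects output in a std::string instead of a FILE * purely to stay self-contained; it does not implement GCC's pretty-printer formats such as %wd.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical dumper: callers never see the underlying stream.
class dumper
{
public:
  dumper () : m_enabled (false) {}

  void enable (bool on) { m_enabled = on; }

  // "Should I bother generating dump data?" returns a boolean, not a FILE *.
  bool enabled_p () const { return m_enabled; }

  // Single choke point for formatted output; a later cleanup could switch
  // this to a different formatter without touching the callers.
  void dump_printf (const char *fmt, ...)
  {
    if (!m_enabled)
      return;
    char buf[256];
    va_list ap;
    va_start (ap, fmt);
    std::vsnprintf (buf, sizeof buf, fmt, ap);
    va_end (ap);
    m_text += buf;
  }

  const std::string &text () const { return m_text; }

private:
  bool m_enabled;
  std::string m_text;   // stand-in for the real dump stream
};

static int expensive_calls = 0;

// Stand-in for perform_some_expensive_computation ().
int
expensive_computation ()
{
  ++expensive_calls;
  return 42;
}

// Caller pattern: the expensive work happens only when dumping is on.
void
run_step (dumper &d)
{
  if (d.enabled_p ())
    d.dump_printf ("result: %d\n", expensive_computation ());
}
```

The guard in run_step is the same conditional as in the "if (logging)" example, but the caller no longer needs to know that a FILE * exists.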
FWIW I originally came up with a very simple API, but I filed it in the rejected ideas appendix: http://dmalcolm.fedorapeople.org/gcc/global-state/rejected-ideas.html#rejected-idea-dump-if-details That rejected idea involves returning a FILE *, which I don't like if the API for dumping is being rethought. * For asm_out_file, see dump_file, stdout and stderr above - there should be a well-defined API for writing assembler output and only the implementation of that API should refer directly to this global. In the plan I suggested using TLS for this in the shared-library build, out of a desire to minimize the patch size. An alternative approach would be to move much of output.h into a class; something like this:

    class MAYBE_SINGLETON asm_out
    {
    public:
      void putc (char);
      void puts (const char *);
      void printf (const char *, ...) some_attribute;

      /* Most of output.h gets moved to here. */
      void assemble_zeros (unsigned HOST_WIDE_INT);
      void assemble_align (int);
      void assemble_string (const char *, int);
      void assemble_external_libcall
Re: Plan for removing global state from GCC's internals
On Thu, 2013-06-27 at 14:50 +, Joseph S. Myers wrote: On Wed, 26 Jun 2013, David Malcolm wrote: FWIW I wonder to what extent the discussions that follow all exhibit a tradeoff between the desire to provide a clean API vs the desire to minimize the size of the patch (to reduce backporting pain). I don't think reducing backporting pain is particularly relevant. There are lots of things to clean up; the question is priorities and ordering rather than whether some particular cleanup should be avoided because of backporting pain it causes. * For dump_file and associated variables such as dump_flags, I sort of think there should be a proper API for code writing to dumps rather than directly accessing dump_file all over the compiler. That should massively reduce the number of places needing to access those fields of the universe. Right, but doing this would be a big patch, touching numerous files. You said you had refactoring tools to make such things easier. Given such tools, I'm imagining the large but mechanical changes shouldn't be so much more work than smaller mechanical changes for the same issue.

I'm wary of scope-creep here. I want to focus on removal of global state, and I want that to be separate from cleanups of internal APIs. IMHO the kinds of change you're proposing *would* be much more work. Although I have some tools to help with big refactorings and they do save some time, it's mainly a way to stop my RSI from flaring up (by generating ChangeLogs, and allowing for easy tweaking based on patch reviews). It's still a fair amount of work to write the refactoring script, deal with the cases where it doesn't work, and ensure that whitespace is properly handled. Compared to the above, options like:

    #define dump_file GET_UNIVERSE().dump_file_
    #define dump_flags GET_UNIVERSE().dump_flags_

or:

    #if SHARED_LIBRARY
    #define MAYBE_TLS __thread
    #else
    #define MAYBE_TLS
    #endif
    extern MAYBE_TLS FILE *dump_file;
    extern MAYBE_TLS int dump_flags;

are trivial.
So I'd much prefer to keep the globals-removal separate from the API cleanup, and not have the latter be a prerequisite for the former; we can go back later and do API cleanups. What would a new API look like? The classic problem with logging/dumping is that you want to avoid doing work in the no-logging case, so that rather than:

    log ("some message: %s", perform_some_expensive_computation ());

with enable/disable in log (and thus always doing the expensive work and typically discarding it), you have things like:

    if (logging)
      {
        log ("some message: %s", perform_some_expensive_computation ());
      }

If we're going to have a logging conditional there, that's going to involve passing around or looking up a variable, so why not simply make it the dump_file? That's a reason for the API to separate whether to generate data for dumps from the actual dumping. But I think the API for whether to dump should (if we think about what a good design would be, rather than what a minimal patch would be) be something that returns a boolean, not a FILE *. The fact that a FILE * is involved internally isn't something most of the code all over the compiler generating data for dumps should need to care about. Here's one motivation for wanting better APIs both for this and for .s output. Some places in the compiler use printf formats such as HOST_WIDE_INT_PRINT_DEC. Others use GCC pretty-printer formats such as %wd (the pretty-printer implementation internally uses HOST_WIDE_INT_PRINT_DEC to implement %wd). People need to remember what to use where, and if they use the wrong format in the wrong place they may get host-dependent bugs. Generally it's a bad idea for host-portability details such as HOST_WIDE_INT_PRINT_DEC to be used all over the place - it's better if they can be used only in a few places implementing formats such as %wd, with everything else using the format %wd. If you have an API such as dump_file_fprintf, you can do that.
If code is using fprintf directly on dump_file, you can't. Now, creating interfaces such as dump_file_fprintf doesn't itself clean up this issue - a separate patch would be needed that makes dump_file_fprintf use the pretty-printer code and makes all formats used with dump_file_fprintf use the pretty-printer formats and only those formats (and not any printf features not supported by the pretty-printer code). But it's a useful starting point to facilitate such a cleanup. And changing fprintf (dump_file, ...) to dump_file_fprintf (...) should be a straightforward change to make, given a refactoring tool that will handle reindenting subsequent lines of arguments to the call to match the new column of the open-parenthesis (and wrapping if that would make lines too long). It might be a larger patch than your suggestion of setting a
Re: Plan for removing global state from GCC's internals
On Thu, 27 Jun 2013, David Malcolm wrote: I want to focus on removal of global state, and I want that to be separate from cleanups of internal APIs. Whereas I'm thinking of global state as being a symptom of a problem - messy interfaces that have accreted over time - rather than the problem in itself. And moving things into universe allows a proof-of-concept of a shared library build (much like Joern's multi-target patches with namespaces three years ago provided a proof-of-concept of a multi-target build) without really addressing the real problem (basically, I think of state in universe as effectively being global state, and moving state into something passed down to the places needing it - only the relevant bits of state, not a whole universe pointer if there's a smaller logical unit - rather than just accessed through TLS, as being the point where global state is *really* eliminated). Now, the bulk conversion to universes seems a lot more maintainable than Joern's multi-target patches, and a lot more plausibly an incremental step to a proper fix, and so a lot more reasonable to go in as an incremental step, but I'd still think of it as one of the infamous partial transitions in the absence of a reason to believe, for each formerly-global object being accessed via the universe (or some other piece of context), that it's being accessed via the right piece of context being passed down to functions that need it, rather than from the global universe for something that doesn't need to be. As for accessing globals directly versus via APIs: yes, I suppose you do still have an access to a global class instance in each place you formerly had a global variable (that's now a member of that class), so by itself such a conversion to a better API doesn't reduce the number of global variable accesses, just improves the interface in other ways - and it's the changes to pass a pointer to an instance around that reduce the global state usage. 
In the case of dump files, pass-local state may be a better place than the universe to keep the instance - it is after all passes.c that calls dump_start / dump_finish. (Indeed, it would arguably be an improvement, in cases where an API uses global state completely implicitly as opposed to via a class instance that contains that state, to pull out that global state to the call sites, reflecting that eventually those call sites should be passing in proper non-global values. For example, converting those diagnostic function uses that implicitly use input_location to use it explicitly instead - each should eventually move to an explicit location that isn't input_location. Implicit use of global_dc matters rather less and it seems reasonable enough for that context to continue to come from the universe in many cases. Cf. how a couple of years ago I made a lot more option-handling code pass around struct gcc_options *, location_t and diagnostic_context * values, rather than directly using the globals - although at some point, a global value tends to get passed into these functions.) (Similarly, for target macro to hook conversions (a) I consider a function in targhooks.c that uses the target macro just to be an interim step and a macro only really to be addressed once all the definitions in different targets have been converted to define functions and the macro has been poisoned, and (b) for each such conversion we at least consider what the hook ought to look like rather than presuming a direct conversion of the macro semantics, one macro to one hook, is appropriate.) -- Joseph S. Myers jos...@codesourcery.com
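The point about pulling implicit global state out to the call sites can be illustrated with a small sketch. Everything below is a mock for illustration only: location_t and input_location are simplified stand-ins for the real GCC declarations, and format_warning / format_warning_at are hypothetical helpers, not GCC's diagnostic functions.

```cpp
#include <cassert>
#include <string>

typedef unsigned int location_t;        // mock of GCC's location_t
static location_t input_location = 0;   // mock of the global

// Explicit style: the location is a parameter, so the function itself
// reads no global state.
std::string
format_warning_at (location_t loc, const std::string &msg)
{
  return std::to_string (loc) + ": warning: " + msg;
}

// Implicit style: silently reads input_location. During a transition this
// can forward to the explicit form; the next step is to inline that
// forwarding at each call site so the use of the global becomes visible
// there and can later be replaced by a proper location.
std::string
format_warning (const std::string &msg)
{
  return format_warning_at (input_location, msg);
}
```

Once every caller spells out input_location itself, each call site can be moved to a real expression location one at a time, without further changes to the callee.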
Plan for removing global state from GCC's internals
I've been looking at removing global state from GCC with a view to making it be usable as a shared library. I've been posting various patches relating to this, but I thought it was time to post a comprehensive plan so you can see how I think it all ought to fit together. You can see an HTML version of my proposal at: http://dmalcolm.fedorapeople.org/gcc/global-state/ and the source for the doc can be seen at: https://github.com/davidmalcolm/gcc-global-state/ (restructured text, using the Sphinx toolchain). I've gone through all of GCC's passes, identifying internal state, and also looked at the most-frequently used global variables in GCC. You can see detailed notes on these in the appendices. It's still a work-in-progress - there are still quite a few TODOs in there, but it seemed time to post to this list. A single-paragraph summary is that I want to move global variables and functions into classes: these classes will be singletons in the normal build, and will have multiple instances in a shared library build, allowing for multiple parallel universes of state within one GCC process. There are various tricks to (a) maintain the performance of the standard monolithic binaries use case and (b) minimize the patching and backporting pain relative to older GCC source trees. In particular, it introduces a new compiler attribute force_static which gets used in stages 2 and 3 of the bootstrap to eliminate this from methods of the various singleton classes. I hope the plan seems reasonable to the core GCC maintainers, and I'm keen to get moving on this for GCC 4.9. However I appreciate that I'm a relative newcomer to GCC development (albeit the author/maintainer of the gcc python plugin, for the last 2 years). There are various questions e.g. what can go into trunk vs a branch? various naming decisions etc. The trunk vs branch question is the one I'm most keen to resolve. 
In particular, I'm aware that Andrew MacLeod recently posted another architectural proposal: http://gcc.gnu.org/ml/gcc/2013-06/msg00163.html AIUI his proposal and mine are mostly orthogonal to each other: his makes changes to the insides to tree, whereas mine bundles up global variables and functions into classes. Both proposals involve touching a lot of code but both can largely be done incrementally, and, I hope independently of each other - but I want to avoid painful branch mergers, of course. BTW, I will be at the GNU Tools Cauldron next month. Thoughts? Dave
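As a rough illustration of the "singleton in the normal build, multiple instances in the shared-library build" idea, the sketch below uses a plain preprocessor macro to switch members between static and per-instance. This only approximates the effect: it uses ordinary `static`, not the proposed force_static attribute, and SHARED_BUILD, MAYBE_STATIC, id_allocator and universe are hypothetical names chosen for this example.

```cpp
#include <cassert>

// Assumption: SHARED_BUILD selects the shared-library configuration.
#ifndef SHARED_BUILD
#define SHARED_BUILD 1
#endif

#if SHARED_BUILD
# define MAYBE_STATIC            /* per-instance state, implicit "this" */
#else
# define MAYBE_STATIC static     /* one copy, no "this" pointer needed */
#endif

// Hypothetical piece of formerly-global state wrapped in a class.
class id_allocator
{
public:
#if SHARED_BUILD
  id_allocator () : m_next (0) {}
#endif
  MAYBE_STATIC int next_id () { return m_next++; }

private:
  MAYBE_STATIC int m_next;
};

#if !SHARED_BUILD
// In the singleton build the state is an ordinary static data member.
int id_allocator::m_next = 0;
#endif

// Hypothetical universe: one bundle of state per client of the library.
struct universe
{
  id_allocator ids;
};
```

In the shared configuration two universe objects count independently; in the other configuration the same call syntax compiles, but all callers share one counter and the methods take no `this` argument.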
Re: Plan for removing global state from GCC's internals
FWIW, we also needed to perform multiple invocations of toplev_main from a single execution of the GCC frontend, which seems to be quite similar. The dirty, dirty hack is to back up the contents of the .data and .bss symbols with the ELF API before the first call to toplev_main, and to reset them to the backed-up values before each subsequent call. And it works.

It would be great to get rid of global state in a better way; maybe our approach could be useful for a transition period.

- D.

On 06/26/2013 02:46 PM, David Malcolm wrote: [...]
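The backup-and-reset idea described in this message can be illustrated in miniature. The sketch below is a simplification under stated assumptions: instead of walking .data and .bss with an ELF API, it snapshots a single hypothetical struct (`global_state`) that stands in for the compiler's globals, and `fake_toplev_main` is a stand-in for the real entry point.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the compiler's mutable global variables.
struct global_state
{
  int error_count;
  int functions_compiled;
};

global_state g_state = { 0, 0 };

// Stand-in for toplev_main: mutates the globals as a side effect.
int fake_toplev_main (int n_functions)
{
  g_state.functions_compiled += n_functions;
  return g_state.functions_compiled;
}

// Save a backup before the first call; restore it before each later call.
int run_isolated (int n_functions)
{
  static bool have_backup = false;
  static global_state backup;

  if (!have_backup)
    {
      std::memcpy (&backup, &g_state, sizeof backup);
      have_backup = true;
    }
  else
    std::memcpy (&g_state, &backup, sizeof g_state);

  return fake_toplev_main (n_functions);
}
```

Calling `run_isolated (3)` and then `run_isolated (5)` returns 3 and 5; without the restore step the second call would see the leftover state and return 8. Because there is only one backup and one live copy, the sketch shares the real hack's limitation of supporting only one invocation at a time.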
Re: Plan for removing global state from GCC's internals
GCC is also hosted on platforms that use neither the SVR4 ABI nor the ELF file format.

- David

On Wed, Jun 26, 2013 at 3:19 PM, Dmitry Mikushin dmi...@kernelgen.org wrote: [...]
Re: Plan for removing global state from GCC's internals
On Wed, 2013-06-26 at 15:19 -0400, Dmitry Mikushin wrote:
> FWIW, we also needed to perform multiple invocations of toplev_main from a single execution of the GCC frontend, which seems to be quite similar.

Interesting. Yes, this sounds very similar to the kinds of use cases I'm considering. Am I right in thinking you're using GCC (and LLVM) to target GPUs?

> The dirty, dirty hack is to back up the contents of the .data and .bss symbols with the ELF API before the first call to toplev_main, and to reset them to the backed-up values before each subsequent call. And it works.

Thanks. I went looking for your code; for reference, is this what you're referring to?
  https://hpcforge.org/scm/viewvc.php/trunk/patches/gcc.patch?revision=1918&root=kernelgen&view=markup
(i.e. the changes to main?)

> It would be great to get rid of global state in a better way; maybe our approach could be useful for a transition period.

Backing up .data and .bss allows for repeated calls to toplev_main, but it doesn't allow multiple threads to run simultaneously within one process.

As you say, it's a dirty, dirty hack. I'm glad it works for you, but it seems very fragile: for example, what happens with GCC plugins and other DSOs linked into the process? Presumably they also have state, which isn't going to get handled if you're only blowing away the state of the main executable.

I'm interested in cleaning this up properly... for some definition of that word, of course!

> - D.
>
> On 06/26/2013 02:46 PM, David Malcolm wrote: [...]
Re: Plan for removing global state from GCC's internals
Yes: we generate both binary code and LLVM IR in a single GCC invocation. The first toplev_main call runs as usual, and a second one runs with the DragonEgg plugin enabled. The LLVM IR ends up as GPU kernel code a bit later.

Yes, that is the right patch. Of course, it is not thread-safe, not generally portable, and very fragile. That's why I said FWIW: it might be useful for some internal transitioning during your very useful effort.

- D.

On 06/26/2013 03:54 PM, David Malcolm wrote: [...]
Re: Plan for removing global state from GCC's internals
For a shared library you need a well-defined namespace for GCC functions / variables so it doesn't interfere with users. Are you going to put everything inside a gcc namespace or similar? (You also need to avoid host libraries such as libiberty - which have C interfaces - interfering with users of the shared library. If the initial hosts for shared library builds are ELF, I suppose you can do that with a version map to hide everything not in the gcc namespace.)

Some observations specifically on your top-40 globals:

* global_options: where available, using a pointer to a gcc_options structure is of course better than using a universe (some attributes could be implemented more cleanly if there were per-function gcc_options pointers, for example). Note also the various TARGET_* macros that are generated to AND a variable with a mask - really, such macros should gain a parameter that is a pointer to a struct gcc_options (and then you'd pass global_options to them if no more specific structure is available).

* For dump_file and associated variables such as dump_flags, I sort of think there should be a proper API for code writing to dumps rather than directly accessing dump_file all over the compiler. That should massively reduce the number of places needing to access those fields of the universe.

* For targetm, make it const (which would move any further enhancements to it firmly into the domain of Andrew MacLeod's proposal and out of yours, because once it's const it's not global state for a compiler only supporting one target at a time). There are only a few places, for a few targets, that write to targetm at runtime; I think it should be possible to fix those by e.g. changing some data hooks in targetm into function hooks.

* For stderr and stdout, really the compiler should only be using them through narrow diagnostic interfaces and not other code using them directly. Put them in global_dc (which would, I suppose, in turn go in the universe)? A shared library user should be able to replace them with streams opened with open_memstream, for example, rather than having diagnostics go direct to stderr/stdout at all. Yes, this requires dealing with implicit uses through functions such as printf as well as explicit references directly in GCC.

* For input_location, you no doubt know we want almost everywhere to use the location of some well-defined source-code construct instead of the global (the diagnostic functions that implicitly use input_location should go away completely, for example). But changing that is a lot of work so things probably do need to start by putting it in the universe as you suggest.

* For flag_isoc99 and associated variables defined in c-common.c, move them to Variable entries in c.opt (i.e., into global_options). In general, if a variable describes state determined by command-line options, moving it into global_options is probably sensible.

* For asm_out_file, see dump_file, stdout and stderr above - there should be a well-defined API for writing assembler output and only the implementation of that API should refer directly to this global.

* For lang_hooks, see targetm and make it const. Some hooks are deliberately modified in free_lang_data - the answer to that is, I think, to stop using those hooks directly except via a wrapper that checks whether free_lang_data has been called (one new global) and decides whether to call the langhook based on that.

--
Joseph S. Myers
jos...@codesourcery.com
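The stderr/stdout point above can be sketched as follows: if every diagnostic goes through a context that owns the output stream, a library user can swap in an in-memory stream. This is only an illustration; `diag_context`, `report_warning` and `capture_warning` are hypothetical names rather than GCC's actual diagnostic API, and open_memstream is a POSIX function, so the sketch assumes a POSIX host.

```cpp
#include <cassert>
#include <stdio.h>
#include <stdlib.h>
#include <string>

// Hypothetical, much-reduced stand-in for a diagnostic context.
struct diag_context
{
  FILE *out;   // where diagnostics go; stderr in a normal compiler run
};

// All reporting goes through the context, never directly to stderr.
void report_warning (diag_context *dc, const char *file, int line,
                     const char *msg)
{
  fprintf (dc->out, "%s:%d: warning: %s\n", file, line, msg);
}

// Capture one diagnostic in memory using POSIX open_memstream.
std::string capture_warning (const char *file, int line, const char *msg)
{
  char *buf = NULL;
  size_t len = 0;
  FILE *mem = open_memstream (&buf, &len);
  if (!mem)
    return std::string ();

  diag_context dc = { mem };
  report_warning (&dc, file, line, msg);

  fclose (mem);                   // flushes and finalizes buf and len
  std::string result (buf, len);
  free (buf);                     // the caller owns the buffer
  return result;
}
```

A normal command-line run would simply construct the context with `stderr`; nothing in `report_warning` needs to know the difference.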
Re: Plan for removing global state from GCC's internals
On Wed, 2013-06-26 at 20:21 +, Joseph S. Myers wrote:
> For a shared library you need a well-defined namespace for GCC functions / variables so it doesn't interfere with users. Are you going to put everything inside a gcc namespace or similar?

FWIW I deliberately avoided talking about API/ABIs in the document: removal of global state is IMHO a necessary but not sufficient condition for being able to use GCC as a shared library; I want to focus on the global variables. That said, putting everything in a gcc namespace sounds reasonable to me (apart from classes used solely within a single source file, which can go in anonymous namespaces, right?).

> (You also need to avoid host libraries such as libiberty - which have C interfaces - interfering with users of the shared library. If the initial hosts for shared library builds are ELF, I suppose you can do that with a version map to hide everything not in the gcc namespace.)

(nods)

> Some observations specifically on your top-40 globals:

FWIW I wonder to what extent the discussions that follow all exhibit a tradeoff between the desire to provide a clean API and the desire to minimize the size of the patch (to reduce backporting pain). I guess I have a vested interest in the latter approach, since a smaller patch is likely to be easier for me to write. However, it also helps people backporting to 4.8 and 4.7.

> * global_options: where available, using a pointer to a gcc_options structure is of course better than using a universe (some attributes could be implemented more cleanly if there were per-function gcc_options pointers, for example). Note also the various TARGET_* macros that are generated to AND a variable with a mask - really, such macros should gain a parameter that is a pointer to a struct gcc_options (and then you'd pass global_options to them if no more specific structure is available).

Interesting.
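The TARGET_* suggestion quoted above amounts to turning an implicit read of a global into an explicit parameter. A minimal sketch, assuming made-up names throughout (`sketch_options`, `MASK_SKETCH_FEATURE`, `TARGET_SKETCH_FEATURE` and `TARGET_SKETCH_FEATURE_P` are illustrative and are not the macros GCC generates):

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for struct gcc_options.
struct sketch_options
{
  int x_target_flags;
};

const int MASK_SKETCH_FEATURE = 1 << 0;   // hypothetical mask bit

// Stand-in for the single global options structure.
sketch_options global_opts = { 0 };

// Implicit style: the macro silently reads the global.
#define TARGET_SKETCH_FEATURE \
  ((global_opts.x_target_flags & MASK_SKETCH_FEATURE) != 0)

// Explicit style: the caller says which options structure it means,
// passing &global_opts when nothing more specific is available.
#define TARGET_SKETCH_FEATURE_P(opts) \
  (((opts)->x_target_flags & MASK_SKETCH_FEATURE) != 0)
```

With the explicit form, a per-function options structure can give a different answer from the global one, which is the case the implicit macro cannot express.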
> * For dump_file and associated variables such as dump_flags, I sort of think there should be a proper API for code writing to dumps rather than directly accessing dump_file all over the compiler. That should massively reduce the number of places needing to access those fields of the universe.

Right, but doing this would be a big patch, touching numerous files. At one extreme, a minimal patch (with no new API) would be something like this:

    #define dump_file GET_UNIVERSE().dump_file_
    #define dump_flags GET_UNIVERSE().dump_flags_

The approach currently in my plan document is somewhat more invasive, to try to avoid all of the GET_UNIVERSE() TLS uses in a shared-library build.

What would a new API look like? The classic problem with logging/dumping is that you want to avoid doing work in the no-logging case. So rather than:

    log ("some message: %s", perform_some_expensive_computation ());

with the enable/disable check inside log (and thus always doing the expensive work and typically discarding it), you have things like:

    if (logging)
      {
        log ("some message: %s", perform_some_expensive_computation ());
      }

If we're going to have a logging conditional there, that's going to involve passing around or looking up a variable, so why not simply make it the dump_file?

FWIW I originally came up with a very simple API, but I filed it in the rejected ideas appendix:
  http://dmalcolm.fedorapeople.org/gcc/global-state/rejected-ideas.html#rejected-idea-dump-if-details

> * For targetm, make it const (which would move any further enhancements to it firmly into the domain of Andrew MacLeod's proposal and out of yours, because once it's const it's not global state for a compiler only supporting one target at a time). There are only a few places, for a few targets, that write to targetm at runtime; I think it should be possible to fix those by e.g. changing some data hooks in targetm into function hooks.

Interesting; I'll explore this.
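One way to reconcile the dump API idea with the "avoid work in the no-logging case" concern discussed above is to hide the conditional inside a macro, so the arguments are not evaluated when dumping is off. The sketch below is an assumption-laden illustration: the `dumper` class and the `DUMP` macro are hypothetical and are neither existing GCC interfaces nor the API from the rejected-ideas appendix.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical per-pass dumper; a NULL file means dumping is disabled.
class dumper
{
public:
  explicit dumper (FILE *f) : m_file (f) {}

  bool enabled_p () const { return m_file != NULL; }

  // printf-style output to the dump file; a no-op when disabled.
  void write (const char *fmt, ...)
  {
    if (!m_file)
      return;
    va_list ap;
    va_start (ap, fmt);
    std::vfprintf (m_file, fmt, ap);
    va_end (ap);
  }

private:
  FILE *m_file;
};

// The arguments are only evaluated when dumping is enabled, because
// they sit inside the guarded branch after macro expansion.
#define DUMP(d, ...) \
  do { if ((d).enabled_p ()) (d).write (__VA_ARGS__); } while (0)

// Instrumented stand-in for an expensive computation.
int expensive_calls = 0;

int perform_some_expensive_computation ()
{
  ++expensive_calls;
  return 42;
}
```

A call such as `DUMP (d, "value: %d\n", perform_some_expensive_computation ())` costs one branch when `d` is disabled, and the conditional still needs a `dumper` to consult, so the sketch does not remove the need to pass that object around.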
> * For stderr and stdout, really the compiler should only be using them through narrow diagnostic interfaces and not other code using them directly. Put them in global_dc (which would, I suppose, in turn go in the universe)? A shared library user should be able to replace them with streams opened with open_memstream, for example, rather than having diagnostics go direct to stderr/stdout at all. Yes, this requires dealing with implicit uses through functions such as printf as well as explicit references directly in GCC.

Double-checking, I think many of the reported uses of stdout/stderr are false positives: stdout is used in the gen* tools when building the compiler, which isn't a problem, and in various debug functions intended to be invoked from inside the debugger, which again is not an issue.

> * For input_location, you no doubt know we want almost everywhere to use the location of some well-defined source-code construct instead of the global (the diagnostic functions that implicitly use input_location should go away completely, for