Re: LTO, LLVM, etc.
Mathieu Lacage wrote:
> A path where different solutions for different problems are evolved
> independently and then merged where it makes sense seems better to me
> than a path where a single solution to two different problems is
> attempted from the start. Which is thus why I think that there are
> inherent reasons that you must necessarily have multiple
> representations.

There are a lot of places, in GCC and otherwise, where having a unified framework for things has been a clear advantage. So, I think your statement that genericity is most often bad is too strong; it's bad sometimes, and good other times. You're definitely right that false commonality can lead to bad results; but, on the other hand, a frequent complaint is that people have to write the same code twice because something that could have been shared was not.

That's why I think we should be talking about the effort required to implement the approaches before us, and the payoffs from where those approaches lead us, rather than generalities about design. (And, if you really want a prize, you can put "risk-adjusted" in front of "effort" and "payoffs" above!)

Thanks,

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
Re: LTO, LLVM, etc.
Mark Mitchell <[EMAIL PROTECTED]> writes:
> There is one advantage I see in the LTO design over LLVM's design. In
> particular, the LTO proposal envisions a file format that is roughly at
> the level of GIMPLE. Such a file format could easily be extended to be
> at the source-level version of Tree used in the front-ends, so that
> object files could contain two extra sections: one for LTO and one for
> source-level information. The latter section could be used for things
> like C++ export -- but, more importantly, for other tools that need
> source-level information, like IDEs, indexers, checkers, etc. (All
> tools that presently use the EDG front end would be candidate clients
> for this interface.)

It seems to me that this is clearly useful anyhow. And it seems to me that whether or not we use LTO, LLVM, or neither, we will still want something along these lines. So if anybody is inclined to work on this, they could start now.

Anything that writes out our high level tree representation (GENERIC plus language specific codes) is going to work straightforwardly for our low level tree representation (GIMPLE). And we are going to want to be able to write out the high level representation no matter what.

In short, while this is an important issue, I don't see it as strongly favoring either side. What it means, essentially, is that LTO is not quite as much work as it might otherwise seem to be, because we are going to do some of the work anyhow. So when considering how much work has to be done for LTO compared to how much work has to be done for LLVM, we should take that into account. This is more or less what you said, of course, but I think with a different spin.

If we do switch to LLVM, it's not going to happen before at least 4.3, and, if I had to guess, not before 4.4. Allow me to be the first person to say that if we switch to LLVM, the first release which incorporates it as the default compilation path should be called 5.0.

Ian
Re: LTO, LLVM, etc.
On Saturday 03 December 2005 20:43, Mark Mitchell wrote:
> There is one advantage I see in the LTO design over LLVM's design. In
> particular, the LTO proposal envisions a file format that is roughly at
> the level of GIMPLE. Such a file format could easily be extended to be
> at the source-level version of Tree used in the front-ends, so that
> object files could contain two extra sections: one for LTO and one for
> source-level information. The latter section could be used for things
> like C++ export -- but, more importantly, for other tools that need
> source-level information, like IDEs, indexers, checkers, etc.

I actually see this as a disadvantage.

IMVHO dumping for export and front-end tools and for the optimizers should not be coupled like this. Iff we decide to dump trees, then I would hope the dumper would dump GIMPLE only, not the full front end and middle-end tree representation.

Sharing a tree dumper between the front ends and the middle-end would only make it more difficult again to move to sane data structures for the middle end and to cleaner data structures for the front ends.

Gr.
Steven
Re: LTO, LLVM, etc.
Steven Bosscher <[EMAIL PROTECTED]> writes:

| On Saturday 03 December 2005 20:43, Mark Mitchell wrote:
| There is one advantage I see in the LTO design over LLVM's design. In
| particular, the LTO proposal envisions a file format that is roughly at
| the level of GIMPLE. Such a file format could easily be extended to be
| at the source-level version of Tree used in the front-ends, so that
| object files could contain two extra sections: one for LTO and one for
| source-level information. The latter section could be used for things
| like C++ export -- but, more importantly, for other tools that need
| source-level information, like IDEs, indexers, checkers, etc.
|
| I actually see this as a disadvantage.
|
| IMVHO dumping for export and front-end tools and for the optimizers
| should not be coupled like this.

I'm wondering what the reasons are.

| Iff we decide to dump trees, then I
| would hope the dumper would dump GIMPLE only, not the full front end
| and middle-end tree representation.
|
| Sharing a tree dumper between the front ends and the middle-end would
| only make it more difficult again to move to sane data structures for
| the middle end and to cleaner data structures for the front ends.

Why?

-- Gaby
Re: LTO, LLVM, etc.
On Dec 5, 2005, at 11:48 AM, Steven Bosscher wrote:
> On Saturday 03 December 2005 20:43, Mark Mitchell wrote:
>> There is one advantage I see in the LTO design over LLVM's design. In
>> particular, the LTO proposal envisions a file format that is roughly
>> at the level of GIMPLE. Such a file format could easily be extended to
>> be at the source-level version of Tree used in the front-ends, so that
>> object files could contain two extra sections: one for LTO and one for
>> source-level information. The latter section could be used for things
>> like C++ export -- but, more importantly, for other tools that need
>> source-level information, like IDEs, indexers, checkers, etc.
>
> I actually see this as a disadvantage.
>
> IMVHO dumping for export and front-end tools and for the optimizers
> should not be coupled like this. Iff we decide to dump trees, then I
> would hope the dumper would dump GIMPLE only, not the full front end
> and middle-end tree representation.
>
> Sharing a tree dumper between the front ends and the middle-end would
> only make it more difficult again to move to sane data structures for
> the middle end and to cleaner data structures for the front ends.

I totally agree with Steven on this one. It is *good* for the representation hosting optimization to be different from the representation you use to represent a program at source level. The two have very different goals and uses, and trying to merge them into one representation will give you a representation that isn't very good for either use.

In particular, the optimization representation really does want something in three-address form. The current tree-ssa implementation emulates this (very inefficiently) using trees, but at a significant performance and memory cost. The representation you want for source-level information almost certainly *must* be a tree.

I think it is very dangerous to try to artificially tie link-time (and other) optimization together with source-level clients. The costs are great and difficult to recover from (e.g. as difficult as it is to move the current tree-ssa work to a lighter-weight representation) once the path has been started.

That said, having a good representation for source-level exporting is clearly useful. To be perfectly clear, I am not against a source-level form, I am just saying that it should be *different* than the one used for optimization.

-Chris
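[Editorial sketch] The distinction Chris draws between a tree-shaped source representation and a flat three-address form can be made concrete in a few lines. This is a toy illustration, not GCC or LLVM code; the tuple encoding of trees and the `flatten` helper are invented for the example.

```python
from itertools import count

def flatten(node, out, temps):
    """Lower a nested expression tree into flat three-address instructions."""
    if isinstance(node, str):            # leaf: a variable name
        return node
    op, lhs, rhs = node
    left = flatten(lhs, out, temps)
    right = flatten(rhs, out, temps)
    dest = f"t{next(temps)}"             # fresh temporary for this node
    out.append((dest, op, left, right))  # one instruction, leaf operands only
    return dest

# a = (b + c) * (b - c), written as a source-level expression tree
tree = ("*", ("+", "b", "c"), ("-", "b", "c"))
insns = []
root = flatten(tree, insns, count(1))
for dest, op, left, right in insns:
    print(f"{dest} = {left} {op} {right}")
# t1 = b + c
# t2 = b - c
# t3 = t1 * t2
```

The tree form keeps the nesting a source-level tool wants; the flat form gives each operation its own instruction over leaf operands, which is what a three-address optimizer iterates over.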
Re: LTO, LLVM, etc.
On 12/5/05, Chris Lattner <[EMAIL PROTECTED]> wrote:
> That said, having a good representation for source-level exporting is
> clearly useful. To be perfectly clear, I am not against a source-level
> form, I am just saying that it should be *different* than the one used
> for optimization.

Debug information describes two things: the source program, and its relationship to the machine code produced by the toolchain. The second is much harder to produce; each pass needs to maintain the relation between the code it produces and the compiler's original input. Keeping the two representations separate (which I could easily see being beneficial for optimization) shifts that burden onto some new party which isn't being discussed, and which will be quite complicated.
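[Editorial sketch] The second, harder half of debug information described above can be illustrated with a toy pass. The structures and the one folding rule are invented; the point is only that every instruction an optimizer creates must inherit a source location from the instruction it replaces, or the source-to-machine-code mapping silently degrades.

```python
from dataclasses import dataclass

@dataclass
class Insn:
    dest: str
    op: str
    args: tuple
    line: int                 # source line this instruction came from

def fold_constants(insns):
    """Fold 'add' of two constants, propagating each source location."""
    out = []
    for insn in insns:
        if insn.op == "add" and all(isinstance(a, int) for a in insn.args):
            total = insn.args[0] + insn.args[1]
            # the replacement inherits the original instruction's location
            out.append(Insn(insn.dest, "const", (total,), insn.line))
        else:
            out.append(insn)
    return out

prog = [Insn("t1", "add", (2, 3), line=7),
        Insn("x", "mov", ("t1",), line=7)]
folded = fold_constants(prog)
print(folded[0])   # the folded constant is still attributed to line 7
```

Every transformation in a real compiler carries this obligation, which is why splitting the source-level and optimization representations creates the "new party" the message refers to: something has to keep the two in correspondence.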
Re: LTO, LLVM, etc.
Steven Bosscher wrote:
> On Saturday 03 December 2005 20:43, Mark Mitchell wrote:
>> There is one advantage I see in the LTO design over LLVM's design. In
>> particular, the LTO proposal envisions a file format that is roughly
>> at the level of GIMPLE. Such a file format could easily be extended to
>> be at the source-level version of Tree used in the front-ends, so that
>> object files could contain two extra sections: one for LTO and one for
>> source-level information. The latter section could be used for things
>> like C++ export -- but, more importantly, for other tools that need
>> source-level information, like IDEs, indexers, checkers, etc.
>
> I actually see this as a disadvantage.
>
> IMVHO dumping for export and front-end tools and for the optimizers
> should not be coupled like this. Iff we decide to dump trees, then I
> would hope the dumper would dump GIMPLE only, not the full front end
> and middle-end tree representation.

You and I have disagreed about this before, and I think we will continue to do so. I don't see anything about Tree that I find inherently awful; in fact, it looks very much like what I see in other front ends. There are aspects I dislike (overuse of pointers, lack of type-safety, unnecessary copies of types), but I couldn't possibly justify changing the C++ front-end, for example, to use something entirely other than Tree. That would be a big project, and I don't see much benefit; I think that the things I don't like can be fixed incrementally. (For example, it occurred to me a while back that by fixing the internal type-correctness of expressions, which we want to do anyhow, we could eliminate TREE_TYPE from expression nodes, which would save a pointer.) It's not that I would object to waking up one day to find out that the C++ front-end no longer used Tree, but it just doesn't seem very compelling to me.

> Sharing a tree dumper between the front ends and the middle-end would
> only make it more difficult again to move to sane data structures for
> the middle end and to cleaner data structures for the front ends.

The differences between GIMPLE and C++ Trees are small, structurally; there are just a lot of extra nodes in C++ that never reach GIMPLE. If we had a tree dumper for one, we'd get the other one almost for free. So, I don't think sharing the tree dumper stands in the way of anything; you can still switch either part of the compiler to use non-Tree whenever you like. You'll just need a new dumper, which you would have wanted anyhow.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
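[Editorial sketch] The "almost for free" claim above can be illustrated with a toy generic dumper. The node names below are real GCC tree codes, but the tuple encoding and the walker are invented for the sketch: if C++ trees are structurally GIMPLE trees plus extra node kinds, one generic walker serves both.

```python
def dump(node, depth=0):
    """Render any (code, child, ...) tree as indented lines of text."""
    code, *children = node
    lines = ["  " * depth + code]
    for child in children:
        lines.extend(dump(child, depth + 1))
    return lines

# a GIMPLE-level assignment: x = y + 1
gimple = ("MODIFY_EXPR", ("VAR_DECL",),
          ("PLUS_EXPR", ("VAR_DECL",), ("INTEGER_CST",)))

# a C++-only wrapper node around the same core; the walker needs no changes
cxx = ("CLEANUP_STMT", gimple)

print("\n".join(dump(cxx)))
```

Because the walker dispatches on structure rather than on a fixed node list, adding front-end-only codes extends the dumped vocabulary without touching the code that handles the GIMPLE subset.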
Re: LTO, LLVM, etc.
Chris Lattner wrote:
> I totally agree with Steven on this one. It is *good* for the
> representation hosting optimization to be different from the
> representation you use to represent a program at source level. The two
> have very different goals and uses, and trying to merge them into one
> representation will give you a representation that isn't very good for
> either use.

I don't think that's entirely true. One of the nice things about WHIRL, at least in theory, is that the representation is gradually lowered throughout the compiler, but is never abruptly transitioned, as with GCC's Tree-RTL conversion. So, it's easier to reuse code, instead of having a Tree routine and an RTL routine that do the same thing, as we do in several places in GCC.

As a concrete example, having a control-flow graph in the front-end is very useful, for optimization purposes, diagnostic purposes, and for plugging in domain-specific optimizers and analyzers. It would be nice to have flow-graph code that could be easily used in both places, without having to make that code representation-independent, using adapters to abstract away the actual representation.

That's not to say that I disagree with:

> In particular, the optimization representation really does want
> something in three-address form. The current tree-ssa implementation
> emulates this (very inefficiently) using trees, but at a significant
> performance and memory cost. The representation you want for
> source-level information almost certainly *must* be a tree.

Instead, it's a long-winded way of saying that I don't agree that there's any inherent benefit to using completely different representations, but that I do agree that one wants the right representation for the job, and that Tree-SSA is not the best representation for optimization. So, if Tree-SSA is not replaced, it will almost certainly need to evolve.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
Re: LTO, LLVM, etc.
On Dec 5, 2005, at 5:27 PM, Mark Mitchell wrote:
> Steven Bosscher wrote:
>> IMVHO dumping for export and front-end tools and for the optimizers
>> should not be coupled like this. Iff we decide to dump trees, then I
>> would hope the dumper would dump GIMPLE only, not the full front end
>> and middle-end tree representation.
>
> It's not that I would object to waking up one day to find out that the
> C++ front-end no longer used Tree, but it just doesn't seem very
> compelling to me.

I agree with you. The 'tree' data structure is conceptually what we want for the front-ends to represent the code. They are quite similar in spirit to many AST representations.

>> Sharing a tree dumper between the front ends and the middle-end would
>> only make it more difficult again to move to sane data structures for
>> the middle end and to cleaner data structures for the front ends.
>
> The differences between GIMPLE and C++ Trees are small, structurally;
> there are just a lot of extra nodes in C++ that never reach GIMPLE. If
> we had a tree dumper for one, we'd get the other one almost for free.
> So, I don't think sharing the tree dumper stands in the way of
> anything; you can still switch either part of the compiler to use
> non-Tree whenever you like. You'll just need a new dumper, which you
> would have wanted anyhow.

The point that I'm arguing (and I believe Steven agrees with) is that trees make a poor representation for optimization. Their use in tree-ssa has led to a representation that takes hundreds of bytes and half a dozen separate allocations for each gimple operation. From the efficiency standpoint alone, it doesn't make sense to use trees for optimization.

Further, I would point out that it actually HURTS the front-ends to have the optimizers using trees. We are getting very close to the time when there are not enough tree codes to go around, and there is still a great demand for new ones. Many of these tree codes are front-end specific (e.g. BIND_EXPR and various OpenMP nodes) and many of them are backend specific (e.g. the various nodes for the vectorizer). Having the front-end and the back-end using the same enum *will* have a short-term cost if the size of the tree enum field needs to be increased.

-Chris
Re: LTO, LLVM, etc.
On Dec 5, 2005, at 5:43 PM, Mark Mitchell wrote:
> Chris Lattner wrote:
>> I totally agree with Steven on this one. It is *good* for the
>> representation hosting optimization to be different from the
>> representation you use to represent a program at source level. The two
>> have very different goals and uses, and trying to merge them into one
>> representation will give you a representation that isn't very good for
>> either use.
>
> I don't think that's entirely true. One of the nice things about WHIRL,
> at least in theory, is that the representation is gradually lowered
> throughout the compiler, but is never abruptly transitioned, as with
> GCC's Tree-RTL conversion. So, it's easier to reuse code, instead of
> having a Tree routine and an RTL routine that do the same thing, as we
> do in several places in GCC.

I understand where you are coming from here, and agree with it. There *is* value to being able to share things. However, there is a cost. I have never heard anything good about WHIRL from a compilation time standpoint: the continuous lowering approach does have its own cost. Further, continuous lowering makes the optimizers more difficult to deal with, as they either need to know what 'form' they are dealing with, and/or can only work on a subset of the particular forms (meaning that they cannot be freely reordered).

>> In particular, the optimization representation really does want
>> something in three-address form. The current tree-ssa implementation
>> emulates this (very inefficiently) using trees, but at a significant
>> performance and memory cost. The representation you want for
>> source-level information almost certainly *must* be a tree.
>
> Instead, it's a long-winded way of saying that I don't agree that
> there's any inherent benefit to using completely different
> representations, but that I do agree that one wants the right
> representation for the job, and that Tree-SSA is not the best
> representation for optimization. So, if Tree-SSA is not replaced, it
> will almost certainly need to evolve.

What sort of form do you think it could/would reasonably take? [1] Why hasn't it already happened? Wouldn't it make more sense to do this work independently of the LTO work, as the LTO work *depends* on an efficient IR and tree-ssa would benefit from it anyway?

-Chris

[1] I am just not seeing a better way; this is not a rhetorical question!
Re: LTO, LLVM, etc.
Chris Lattner wrote:

[Up-front apology: If this thread continues, I may not be able to reply for several days, as I'll be travelling. I know it's not good form to start a discussion and then skip out just when it gets interesting, and I apologize in advance. If I'd been thinking better, I would have waited to send my initial message until I returned.]

> I understand where you are coming from here, and agree with it.
> However, there is a cost. I have never heard anything good about WHIRL
> from a compilation time standpoint: the continuous lowering approach
> does have its own cost.

I haven't heard anything either way, but I take your comment to mean that you have heard that WHIRL is slow, and I'm happy to believe that. I'd agree that a data structure capable of representing more things almost certainly imposes some cost over one capable of representing fewer things! So, yes, there's definitely a cost/benefit tradeoff here.

> What sort of form do you think it could/would reasonably take? [1] Why
> hasn't it already happened? Wouldn't it make more sense to do this work
> independently of the LTO work, as the LTO work *depends* on an
> efficient IR and tree-ssa would benefit from it anyway?

To be clear, I'm really not defending the LTO proposal. I stand by my statement that I don't know enough to have a preference! So, please don't read anything more into what's written here than just the plain words.

I did think a little bit about what it would take to make Tree-SSA more efficient. I'm not claiming that there aren't serious or even fatal flaws in those thoughts; this is just a brain dump. I also don't claim to have measurements showing how much of a difference these changes would make.

I'm going to leave TYPE nodes out -- because they're shared with the front-ends, and so will live on anyhow. Similarly for the DECL nodes that correspond to global variables and global functions. So, that leaves EXPR nodes and (perhaps most importantly!) DECLs for local/temporary variables.

The first thing to do would be to simplify the local variable DECLs; all we should really need from such a thing is its type (including its alignment, which, despite historical GCC practice, is part of its type), its name (for debugging), its location relative to the stack frame (if we want to be able to do optimizations based on the location on the stack, which we may or may not want to do at this point), and whatever mark bits or scratch space are needed by optimization passes. The type and name are shared across all SSA instances of the same variable -- so we could use a pointer to a canonical copy of that information. (For user-visible variables, the canonical copy could be the VAR_DECL from the front end.) So, local variables would collapse from 176 bytes (on my system) to something more like 32 bytes.

The second thing would be to modify expression nodes. As I mentioned, I'd eliminate their TYPE fields. I'd also eliminate their TREE_COMPLEXITY fields, which are already nearly unused. There's no reason TREE_BLOCK should be needed in most expressions; it's only needed on nodes that correspond to lexical blocks. Those changes would eliminate a significant amount of the current size (64 bytes) for expressions. I also think it ought to be possible to eliminate the source_locus field; instead of putting it on every expression, insert line-notes into the statement stream, at least by the time we reach the optimizers.

I'd also eliminate uses of TREE_LIST to link together the nodes in CALL_EXPRs; instead use a vector of operands hanging off the end of the CALL_EXPR corresponding to the number of arguments in the call. Similarly, I'd consider using a vector to store statements in a block, rather than a linked list.
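[Editorial sketch] The local-DECL slimming proposed above amounts to roughly this layout. This is a toy model: the field choices and the 176-vs-32-byte figures come from the message, but the classes themselves are invented. Each SSA instance holds only a pointer to a shared canonical record plus its own per-version state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CanonicalVar:
    """Shared, one per source variable; the front end's VAR_DECL could serve."""
    name: str            # kept for debugging
    type: str            # includes alignment, as part of the type

@dataclass
class SSAName:
    """Small per-version record: a pointer to the canonical copy plus scratch."""
    var: CanonicalVar    # shared across all SSA instances of the variable
    version: int
    mark: bool = False   # scratch bit for optimization passes

i = CanonicalVar("i", "int")
versions = [SSAName(i, v) for v in range(3)]   # i_0, i_1, i_2

# every SSA version shares one type/name record rather than carrying its own
assert all(v.var is i for v in versions)
print(f"{versions[2].var.name}_{versions[2].version} : {versions[2].var.type}")
# i_2 : int
```

The saving comes from the sharing: the per-version record shrinks to a pointer, a version number, and whatever scratch bits the passes need.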
Finally, if you wanted, you could flatten expressions so that each expression was, a la LLVM, an instruction, and all operands were leaves rather than themselves trees; that's a subset of the current tree format. I'm not sure that step would in and of itself save memory, but it would be more optimizer-friendly.

In my opinion, the reason this work hasn't been done is that (a) it's not trivial, and (b) there was no sufficiently pressing need. GCC uses a lot of memory, and that's been an issue, but it hasn't been a killer issue in the sense that huge numbers of people who would otherwise have used GCC went somewhere else. Outside of work done by Apple and CodeSourcery, attacking that probably hasn't been (as far as I know?) funded by any companies.

You're correct that LTO, were it to proceed, might make this a killer issue, and then we'd have to attack it -- and so that work should go on the cost list for LTO. You're also correct that some of this work would also benefit GCC as a whole, in that the front-ends would use less memory too, and so you're also correct that there is value in doing at least some of the work independently of LTO -- although
Re: LTO, LLVM, etc.
Steven Bosscher wrote:
> What makes EDG so great is that it represents C++ far closer to the
> actual source code than G++ does.

I know the EDG front-end very well; I first worked with it in 1994, and I have great respect for both the EDG code and the EDG people. I disagree with your use of "far closer" above; I'd say "a bit closer". Good examples of differences are that (before lowering) it has a separate operator for virtual function call (rather than using a virtual function table explicitly) and that pointers-to-member functions are opaque objects, not structures. These are significant differences, but they're not huge differences, or particularly hard to fix in G++. The key strengths of the EDG front-end are its correctness (second to none), cleanliness, excellent documentation, and excellent support. It does what it's supposed to do very well.

> It would be good for G++ to have a representation that is closer to the
> source code than what it has now.

Yes, closing the gap would be good! I'm a big proponent of introducing a lowering phase into G++. So, while I might disagree about the size of the gap, I agree that we should eliminate it. :-)

> I'd be surprised if a compiler exists that runs optimizations on EDG's
> C++ specific representation. I think all compilers that use EDG
> translate EDG's representation to a more low-level representation.

I've worked on several compilers that used the EDG front-end. In all cases, there was eventually translation to different representations, and I agree that you wouldn't want to do all your optimization on EDG IL. However, one compiler I worked on did do a fair amount of optimization on EDG IL, and the KAI inliner also did a lot of optimization (much more than just inlining) on EDG IL. Several of the formats to which I've seen EDG IL translated (WHIRL and a MetaWare internal format, for example) are at about the level of lowered EDG IL (which is basically C with exceptions), which is the form of EDG IL that people use when translating into their internal representation. In some cases, these formats are then again transformed into a lower-level, more RTL-ish format at some point during optimization.

I'm not saying that having two different formats is necessarily a bad thing (we've already got Tree and RTL, so we're really talking about two levels or three), or that switching to LLVM is a bad idea, but I don't think there's any inherent reason that we must necessarily have multiple representations. My basic point is that I want to see the decision be made on the basis of the effort required to achieve our goals, not on our opinions about what we think might be the best design in the abstract. In other words, I don't think that the fact that GCC currently uses the same data structures for front-ends and optimizers is in and of itself a problem -- but I'm happy to switch to LLVM, if we think that it's easier to make LLVM do what we want than it is to make Tree-SSA do what we want.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
Re: LTO, LLVM, etc.
hi mark,

On Mon, 2005-12-05 at 21:33 -0800, Mark Mitchell wrote:
> I'm not saying that having two different formats is necessarily a bad
> thing (we've already got Tree and RTL, so we're really talking about
> two levels or three), or that switching to LLVM is a bad idea, but I
> don't think there's any inherent reason that we must necessarily have
> multiple representations.

In what I admit is a relatively limited experience (compared to that of you or other gcc contributors) of working with a few large old sucky codebases, I think I have learned one thing: genericity is most often bad. Specifically, I think that trying to re-use the same data structure/algorithms/code for widely different scenarios is what most often leads to large overall complexity and fragility.

It seems to me that the advantages of using the LTO representation for frontend-dumping and optimization (code reuse, etc.) are not worth the cost (a single piece of code used for two very different use-cases will necessarily be more complex and thus prone to design bugs). Hubris will lead developers to ignore the latter because they believe they can avoid the complexity trap of code reuse. It might work in the short term because you and others might be able to achieve this feat but I fail to see how you will be able to avoid the inevitable decay of code inherent to this solution in the long run.

A path where different solutions for different problems are evolved independently and then merged where it makes sense seems better to me than a path where a single solution to two different problems is attempted from the start. Which is thus why I think that there are inherent reasons that you must necessarily have multiple representations.

regards,
Mathieu

PS: I know I am oversimplifying the problem and your position and I apologize for this.

--
LTO, LLVM, etc.
I've been watching the LLVM/LTO discussion with interest. I'm learning that I need to express myself carefully, because people read a lot into what I say, so I've been watching, and talking with lots of people, but not commenting. But, I've gotten a couple of emails asking me what my thoughts are, so here they are! None of what follows is an official position of the FSF, Steering Committee, or even CodeSourcery; it's just my personal thoughts.

First and foremost, I'm not an expert on either Tree-SSA or LLVM, so I'm only qualified to comment at a high level. From what I can see, and by all accounts, LLVM is a clean, well-engineered codebase with good capabilities. Assuming that all of the copyright details are worked out, which Chris is actively trying to do, I think we should consider the costs and benefits of replacing Tree-SSA with LLVM. I'm not sure exactly how the costs and benefits stack up, but we'll see. That shouldn't be read as either a favorable or unfavorable comment about switching; I certainly think we should consider LLVM, but I don't have an opinion as to what the outcome of that consideration ought to be.

For me, the key consideration is the shape of the compiler-goodness graph vs. time, where goodness includes (in no particular order) optimization capability, cross-platform capability, correctness, backwards compatibility, support for link-time optimization, developer happiness, etc. Like some others have suggested, if it were up to me to pick (which it's not, since I don't control the developer base, steering committee, etc.), I'd make a big list of things we would have to do to LLVM and things we would have to do to Tree-SSA, and then decide which one looked easier.

The reason the shape of the graph matters to me, rather than just the value at some time t, is that I'm concerned about increasing GCC's overall market share, and market share is sticky, so, ideally, progress is continuous; periods of flatness, or downtrends, are harmful. However, one clearly doesn't want to win in the short term, only to lose big in the long term, so if one of the LLVM or Tree-SSA lines is significantly higher in the foreseeable future, that's probably a bigger consideration than the shape of the graph in the short term.

If we're opening the door to replacing Tree-SSA, are there any other technologies we should consider? In particular, brushing aside any copyright/patent issues, how would a Tree-WHIRL-RTL widget, using the Open64 technology, stack up relative to Tree-SSA and LLVM? Do any of the Open64 people have interest in integrating with GCC in this way? What are the legal issues and, if there are serious issues, does anyone want to try to resolve them? Again, this should not be read as advocating Open64; these aren't rhetorical questions; I just don't know the answers.

There is one advantage I see in the LTO design over LLVM's design. In particular, the LTO proposal envisions a file format that is roughly at the level of GIMPLE. Such a file format could easily be extended to be at the source-level version of Tree used in the front-ends, so that object files could contain two extra sections: one for LTO and one for source-level information. The latter section could be used for things like C++ export -- but, more importantly, for other tools that need source-level information, like IDEs, indexers, checkers, etc. (All tools that presently use the EDG front end would be candidate clients for this interface.) There's a lot of interest in these kinds of tools, and I think their existence would be a competitive advantage for GCC because they would create compelling reasons to use GCC beyond just its capabilities as a compiler. So, at some point, I think we'll probably want (or even need) to add such an interface to GCC.

LLVM's bytecode is a flat, three-address code style. That's convenient for optimization, and more compact than Tree, but source-level tools actually want tree data structures, complex expressions, and high-level control-flow primitives (so that they can even do things like distinguish a do-loop from a while-loop). So, it would be a drastic change to try to extend LLVM's bytecode format to present source-level information in this way. Nothing about LLVM is a step backwards from where we are today, with respect to this kind of tool integration. It's just that LLVM doesn't particularly advance us in that direction, whereas the infrastructure for the LTO proposal would facilitate this effort, in addition to just LTO.

So, a possible advantage of the LTO proposal in this respect is that it might be a faster path to having both LTO and a source-level interface, and leave us with only one set of routines for reading/writing intermediate code to files. The obvious counter-point is that LLVM is almost certainly a faster path to link-time optimization, since it already works, and that it doesn't in any way prevent us from adding the source-level integration later. The fact