> Date: Tue, 24 Sep 2013 05:31:40 +0200
> From: Sylvain BERTRAND <[email protected]>
> To: [email protected]
> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
> Message-ID: <20130924033140.GG754@freedom>
> Content-Type: text/plain; charset=us-ascii
>
>>>> Thoughts?
>>>
>>> Wow... :) You totally missed my point.
>>>
>>> My idea is to have a language which has a lower implementation
>>> technical cost. That's why I was saying "the other way".
>>
>>
>> Are you talking about the IL/IC, or the other one we were talking
>> about? Because I talked about two there, not one.
>
> The C- one.
>

So that was in response to the "if I were implementing a language from
scratch" bit, right? If I were doing that, I'd be looking at an
application language that would easily integrate with a systems
language, hence high-end features along with C compatibility. If I were
going for a new systems language, I'd just take C, modify the syntax
for pointers & declarations, and maybe modify the standard library.
Presumably a smaller job than a from-scratch language.

>> So you want to hook into the TCC parser itself?
>
> I said, if this has no obvious blockers, we could use fake
> targets that would be optimization passes. They would output C
> code.

Yeah, I mostly paid attention to the fake-target bit, since outputting
IL/IC from that seemed like the easiest way into the "standard route".

> Regarding the unused code elimination across compilation units,
> it probably involves the linker. Then the "trick" of the fake
> target may not be as easy.

That depends. Eliminating unused functions & variables works like that,
but it only requires the ability to detect when you're in the
file-scope instead of a scope contained within a file, and to take that
as a signal to place any additional variables or functions into a new
section. That, and info on what those sections need to import, are all
that you need, and all of that should (at least presumably, it's been a
while since I poked at object file formats) be supported by your
ordinary object file format. Or, at least, your ordinary library file
format.

Once you have that, it's a matter of creating a linker mode that will
assign two bits (one for "needed", one for "supplied") in a memory
block to each of the sections, and starting a search from your "root
section" (probably the one containing the "main" or equivalent
function, but possibly a file declaring exports too). Every time that
you find a dependency, you ensure that it has either its "needed" or
its "supplied" bit set. Once you've finished checking through every
section that you ALREADY knew to check, you output them, make a new
list of sections to check (they'll be the ones marked "needed" instead
of "supplied"), switch all of the "needed" sections to be only
"supplied" instead, and start the dependency search again. You only
stop once you run out of sections that are "needed".

> We may have to "annotate" the
> generated C code for the real target to insert the proper
> information in the object file for the linker. I bet that
> optimization pass would be kind of the last one.

Unused function removal works as I stated above: you find a starting
point, find all of its dependencies, write out the starting point, then
recursively check the dependencies for new dependencies and write the
old dependencies out. Other forms of unused code removal either never
leave the compiler (e.g. removing if( 0 ) blocks), or should be left
for a later date (some things are more foundational than others).

> - A compilation unit scoped dead/unused code removal fake target

Let's worry about unused function removal first, since that should be
the fastest to implement, okay? Depending on details that Grischka
would know but I don't, we might need to build a parse tree before we
can eliminate unused code INSIDE of functions, in which case the target
will be a "parse tree" output of SOME kind, regardless of whether it's
assembly-ish, C-ish, or something-else-ish.

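To make the needed/supplied search above a bit more concrete, here is a
minimal sketch in C. Everything in it is made up for illustration
(struct section, MAX_DEPS, mark_reachable); none of it is TCC or linker
code, and it assumes each section's imports have already been resolved
to section indices. It also folds the "make a new list" step into
repeated passes over the flag array, which gives the same result with
less bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-section state bits, named after the description. */
enum { SEC_NEEDED = 1, SEC_SUPPLIED = 2 };

#define MAX_DEPS 4

/* Hypothetical section record: a name plus the sections it imports from. */
struct section {
    const char *name;
    size_t ndeps;
    size_t deps[MAX_DEPS];   /* indices into the section array */
};

/*
 * Starting from `root`, mark every section that must be kept.
 * On return flags[i] is SEC_SUPPLIED for kept sections and 0 for
 * sections that nothing reachable depends on. Returns the number kept.
 */
size_t mark_reachable(const struct section *secs, size_t n, size_t root,
                      unsigned char *flags)
{
    size_t i, d, kept = 0;
    int found_new = 1;

    assert(secs != NULL && flags != NULL && root < n);
    for (i = 0; i < n; i++)
        flags[i] = 0;
    flags[root] = SEC_NEEDED;

    /* Keep sweeping until a pass marks no new "needed" section. */
    while (found_new) {
        found_new = 0;
        for (i = 0; i < n; i++) {
            if (flags[i] != SEC_NEEDED)
                continue;
            /* "Output" the section: needed becomes supplied. */
            flags[i] = SEC_SUPPLIED;
            kept++;
            for (d = 0; d < secs[i].ndeps; d++) {
                size_t dep = secs[i].deps[d];
                assert(dep < n);
                if (flags[dep] == 0) {   /* neither bit set yet */
                    flags[dep] = SEC_NEEDED;
                    found_new = 1;
                }
            }
        }
    }
    return kept;
}
```

With sections main -> helper -> util and two further sections that only
refer to each other and to util, a search rooted at main keeps the
first three and leaves the other two with no bits set. A real linker
mode would write each section out at the point where the sketch flips
it to "supplied".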
Also, by virtue of some of Grischka's comments below, I don't think
that just sticking the optimizations entirely inside a fake target will
be enough: we'll need to build a parse tree for anything other than the
minor stuff, in which case we might as well have the fake target be the
parse tree instead of making it be the actual optimization.

> - A C code annotation target which creates a dependency tree
> of machine code sections for the linker to optimize out or
> not.

Allowing individual sections to have their own dependencies will do
this perfectly fine.

> Anyway, I don't think I will ever have the time to code any of
> this, so...
>

I don't know how much time you have, but you might compare it to the
description that I've given of how to implement unused function
removal. Alternatively, I'm going to be testing a pointer-based
incremental garbage collector sometime soon (or, more accurately, I'll
probably be testing the tree implementation that it requires within the
week), so poke me sometime in October to get it done; that way building
the parse tree will be easier (it's going to be pretty nice really,
because you'll only need to deal with a handful of functions: the
system does all of the heavy lifting).

> Date: Tue, 24 Sep 2013 10:55:46 +0200
> From: Vittorio Giovara <[email protected]>
> To: [email protected]
> Subject: [Tinycc-devel] How to get DCE (or something similar) [was:
> inline assembly and optimization passes]
> Message-ID:
> <cablwns85qv7ubvgtgu0dqsfowaras7elzdstpbs0inp6skc...@mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Tue, Sep 24, 2013 at 4:44 AM, Jared Maddox <[email protected]>
> wrote:
>>> Date: Mon, 23 Sep 2013 14:31:04 +0200
>>> From: Vittorio Giovara <[email protected]>
>>> To: [email protected]
>>> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
>>> It would be really nice to have some compiler switch (if not
>>> integrated) that enables this functionality.
>>
>> What kind of dead code?
>> If you're talking about getting rid of ifs,
>> whiles, etc., that you can theoretically know at compile-time will
>> never be used, then they fall into two categories:
>> 1) those that are constants entirely within the argument to the
>> conditional ( e.g. if( 0 ), and at a higher level if( a / a > 1 ) ),
>> and
>> 2) those that are constrained to a range of values within the
>> conditional that will always evaluate to false, but are not strictly
>> constants.
>>
>> Category 1, and especially its first sub-case, is relatively easy to
>> deal with, and should be possible to optimize out of TCC within the
>> current framework. Subcase 2 is a little more complicated, because you
>> need to use elimination or similar to determine that one or more
>> things are actually just a distraction within the expression, and
>> theoretically shouldn't have any effect; however, in the real
>> bit-limited world, this can technically have an effect by interfering
>> with floating point accuracy calculations, so it should probably be
>> placed under a higher-level optimization flag.
>>
>> Category 2 requires a more in-depth analysis, which is, in turn,
>> relevant when inlining functions from other compilation contexts as
>> well. Thus, I'd say that both because of its greater complexity, and
>> because of its utility when operating on the output of other
>> compilation stages, treatment for category 2 should be kept separate
>> from TCC proper.
>
> Yes, as you got below I meant removing functions such as
>
>   if (0)
>     some_function();
>
> The "problem" is that many applications use this construct instead of
> adding #ifdefs everywhere.

So I've heard. Just use #if already, people! It exists for a reason,
after all.

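For the first sub-case of category 1 (conditions built purely out of
constants, like if( 0 )), the check itself is small. Here is a rough
sketch in C; the expression node type and both function names are
invented for the example, and it assumes some expression tree already
exists, which stock TCC does not build. It deliberately does NOT
handle the a / a > 1 sub-case, since that needs the algebraic
elimination mentioned in the quoted text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node; not a TCC data structure. */
enum expr_kind { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL, EXPR_GT };

struct expr {
    enum expr_kind kind;
    long value;                    /* used by EXPR_CONST only */
    const struct expr *lhs, *rhs;  /* used by the binary operators */
};

/*
 * Returns 1 and stores the value in *out when `e` is built only from
 * constants; returns 0 as soon as a variable is involved.
 * (Overflow handling is ignored to keep the sketch short.)
 */
int fold_constant(const struct expr *e, long *out)
{
    long l, r;

    assert(e != NULL && out != NULL);
    if (e->kind == EXPR_CONST) {
        *out = e->value;
        return 1;
    }
    if (e->kind == EXPR_VAR)
        return 0;
    if (!fold_constant(e->lhs, &l) || !fold_constant(e->rhs, &r))
        return 0;
    switch (e->kind) {
    case EXPR_ADD: *out = l + r; return 1;
    case EXPR_MUL: *out = l * r; return 1;
    case EXPR_GT:  *out = l > r; return 1;
    default:       return 0;
    }
}

/* 1 when the body of `if (cond)` can never run and may be dropped. */
int branch_is_dead(const struct expr *cond)
{
    long v;
    return fold_constant(cond, &v) && v == 0;
}
```

So if( 0 ) and if( 1 > 2 ) both come back as dead, while anything that
mentions a variable is left alone.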
>>> Date: Mon, 23 Sep 2013 17:05:35 +0200
>>> From: Vittorio Giovara <[email protected]>
>>> To: "Thomas Preud'homme" <[email protected]>
>>> Cc: [email protected]
>>> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
>> If, on the other hand, you're talking about dead code elimination in
>> the sense of getting rid of functions that aren't used, then see the
>> following. This interplays with getting rid of similar code inside of
>> functions, but is easier in some ways.
>
> So, I took a very noob approach and tried the following in tccelf.c:
>
> 1. don't abort when you find an undefined symbol (gives FPE at execution)
> 2. mark every undefined symbol as weak linked (segfaults)
> 3. hijack every undefined function to a sink function that does
> nothing (couldn't fully implement it)
>

What was the problem? If it was just how to write the function itself,
then do it in assembly, have it ignore its arguments, and have it only
call exit() with a value indicating failure. The fact that the function
should never be reached means that reaching it should be considered a
fatal action 100% of the time.

>> Try looking here:
>> https://lists.nongnu.org/archive/html/tinycc-devel/2012-10/msg00044.html
>>
>> The gcc compiler & linker flags mentioned there are directly relevant
>> to this, and basically describe the relevant operations: each function
>> and variable goes into a separate section within the object file (or,
>> in the case of TCC, sometimes just in memory), and either all of the
>> sections get stored into a library file, or only the ones actually
>> used get stored into an object or executable file. So, multi-step
>> process, but the end result is definitely useful.
>>
>> AFTER that gets done you can worry about invoking the optimization
>> with a single flag, since those capabilities are just as useful by
>> themselves as they are when grouped together.
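
Going back to the sink function from point 3 for a moment, here is
roughly what I mean, written in C as a stand-in for the assembly
version. The symbol table layout and the names (struct symbol,
undefined_sink, resolve_or_sink) are invented for the example and are
not what tccelf.c actually uses. Note that calling the sink through a
pointer of a different function type is not something C guarantees to
work, which is part of why I suggested assembly for the real thing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void (*fn_ptr)(void);

/* Hypothetical symbol table entry; addr == NULL means "undefined". */
struct symbol {
    const char *name;
    fn_ptr addr;
};

/* Only reached if supposedly dead code runs after all: always fatal. */
void undefined_sink(void)
{
    fputs("fatal: call to an undefined function\n", stderr);
    exit(EXIT_FAILURE);
}

/*
 * Return the address recorded for `name`, or the sink when the symbol
 * is undefined or not in the table at all.
 */
fn_ptr resolve_or_sink(const struct symbol *tab, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0 && tab[i].addr != NULL)
            return tab[i].addr;
    return undefined_sink;
}
```

The point of exiting with a failure code instead of "doing nothing" is
that a silent no-op would hide exactly the case where the unused-code
assumption turned out to be wrong.
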
>
> As first step I'd be happy with compiling some code like the one
> above, the optimization of actually removing it might come later on.
>

As for if( 0 ), removing the code is simply how you do the
optimization.

> Date: Tue, 24 Sep 2013 12:05:56 +0200
> From: grischka <[email protected]>
> To: [email protected]
> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Jared Maddox wrote:
>> decl( ) comes from tccgen.c, and looks like the bottom-level function
>> in the parser proper.
>
> true.
>
>> I ASSUME that when it finishes, you'll have a
>> ready-to-use parse tree in the active TCCState.
>
> false. TCC translates to machine code immediately while it reads
> along the input C:
> http://bellard.org/tcc/tcc-doc.html#SEC30
>
>> ... then I'd suggest going back to my suggestion of an IL/IC target
>
> Actually there is a CIL generator in the TinyCC sources that was
> written once by Fabrice Bellard:
> http://repo.or.cz/w/tinycc.git/blob/refs/heads/mob:/il-gen.c
> http://repo.or.cz/w/tinycc.git/blob/refs/heads/mob:/il-opcodes.h
> http://en.wikipedia.org/wiki/Common_Intermediate_Language
>
> However it has fallen behind since then so definitely would need
> some care if someone wants to reactivate it.
>

I did notice that. I considered asking about it, but ultimately decided
that it PROBABLY wasn't quite on subject (this might have been
influenced by some other notes that I wrote with it).

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
