Re: [Tinycc-devel] inline assembly and optimization passes

Jared Maddox Wed, 25 Sep 2013 19:25:05 -0700

> Date: Tue, 24 Sep 2013 05:31:40 +0200
> From: Sylvain BERTRAND <[email protected]>
> To: [email protected]
> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
> Message-ID: <20130924033140.GG754@freedom>
> Content-Type: text/plain; charset=us-ascii
>

>>>> Thoughts?
>>>
>>> Wow... :) You totally missed my point.
>>>
>>> My idea is to have a language which has a lower implementation
>>> technical cost. That's why I was saying "the other way".
>>>
>>
>> Are you talking about the IL/IC, or the other one we were talking
>> about? Because I talked about two there, not one.
>
> The C- one.
>

So that was in response to the "if I were implementing a language from
scratch" bit, right? If I was doing that then I'd be looking at an
application language that would easily integrate with a systems
language, hence high-end features along with C compatibility.

If I was going for a new systems language then I'd just take C, modify
the syntax for pointers & declarations, and maybe modify the standard
library. Presumably a smaller job than a from-scratch language.

>> So you want to hook into the TCC parser itself?
>
> I said, if this has no obvious blockers, we could use fake
> targets that would be optimization passes. They would output C
> code.

Yeah, I mostly paid attention to the fake-target bit, since outputting
IL/IC from that seemed like the easiest way into the "standard route".

> Regarding the unused code elimination across compilation units,
> it involves probably the linker. Then the "trick" of the fake
> target may not be as easy.

That depends. Eliminating unused functions & variables does work like
that, but on the compiler side it only requires the ability to detect
when you're at file scope instead of in a scope nested within the
file, and to take that as a signal to place each additional variable
or function into a new section. That, plus info on what each of those
sections needs to import, is all that you need, and all of it should
(at least presumably, it's been a while since I poked at object file
formats) be supported by your ordinary object file format. Or, at
least, your ordinary library file format.

Once you have that, it's a matter of creating a linker mode that will
assign two bits (one for "needed", one for "supplied") in a memory
block to each of the sections, and starting a search from your "root
section" (probably the one containing the "main" or equivalent
function, but possibly a file declaring exports too). Every time that
you find a dependency, you ensure that it has either its "needed" or
its "supplied" bit set, marking it "needed" if it has neither. Once
you've finished checking through every section that you ALREADY knew
to check, you output those sections, make a new list of sections to
check (the ones marked "needed" instead of "supplied"), switch all of
the "needed" sections to be only "supplied" instead, and start the
dependency search again. You only stop once you run out of sections
that are "needed".
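
For what it's worth, the two-bit search described above can be sketched
in a few lines of C. This is only a rough model under stated
assumptions: struct section, MAX_SECTIONS, MAX_DEPS and
mark_live_sections() are all names made up for illustration, none of
them exist in TCC, and a real linker would take the dependency lists
from relocation and symbol data rather than from a hand-built array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SECTIONS 16   /* illustrative limits, not TCC's */
#define MAX_DEPS 4

enum { SEC_NEEDED = 1, SEC_SUPPLIED = 2 };

/* Hypothetical model of a section: just the sections it imports from. */
struct section {
    size_t ndeps;
    size_t deps[MAX_DEPS];
};

/* Marks every section reachable from root. On return, flags[i] is
 * SEC_SUPPLIED for sections to keep and 0 for sections to drop.
 * Returns the number of sections kept. */
size_t mark_live_sections(const struct section *secs, size_t nsecs,
                          size_t root, unsigned char *flags)
{
    unsigned char batch[MAX_SECTIONS];
    size_t kept = 0;
    int progress = 1;

    assert(nsecs <= MAX_SECTIONS && root < nsecs);
    memset(flags, 0, nsecs);
    flags[root] = SEC_NEEDED;

    while (progress) {
        progress = 0;
        /* Build this round's list: every "needed" section becomes
         * "supplied" (a real linker would write it out here). */
        for (size_t i = 0; i < nsecs; i++) {
            batch[i] = (flags[i] == SEC_NEEDED);
            if (batch[i]) {
                flags[i] = SEC_SUPPLIED;
                kept++;
                progress = 1;
            }
        }
        /* A dependency with neither bit set becomes "needed" next round. */
        for (size_t i = 0; i < nsecs; i++) {
            if (!batch[i])
                continue;
            assert(secs[i].ndeps <= MAX_DEPS);
            for (size_t d = 0; d < secs[i].ndeps; d++) {
                size_t dep = secs[i].deps[d];
                assert(dep < nsecs);
                if (flags[dep] == 0)
                    flags[dep] = SEC_NEEDED;
            }
        }
    }
    return kept;
}
```

Any section whose flags entry is still 0 at the end was never reached
from the root, so it simply doesn't get written. Note that two unused
sections that only refer to each other are dropped as well, since
neither is ever marked "needed".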

> We may have to "annotate" the
> generated C code for the real target to insert the proper
> information in the object file for the linker. I bet that
> optimization pass would be kind of the last one.

Unused function removal works as I stated above: you find a starting
point, find all of its dependencies, write out the starting point,
then recursively check those dependencies for new dependencies,
writing each one out once it has been checked.

Other forms of unused code removal either never leave the compiler
(e.g. removing if( 0 ) blocks), or should be left for a later date
(some things are more foundational than others).

>  - A compilation unit scoped dead/unused code removal fake target

Let's worry about unused function removal first, since that should be
the fastest to implement, okay? Depending on details that Grischka
would know but I don't, we might need to build a parse tree before we
can eliminate unused code INSIDE of functions, in which case the
target will be a "parse tree" output of SOME kind, regardless of
whether it's assembly-ish, C-ish, or something-else-ish.

Also, by virtue of some of Grischka's comments below, I don't think
that just sticking the optimizations entirely inside a fake target
will be enough: we'll need to build a parse tree for anything other
than the minor stuff, in which case we might as well have the fake
target be the parse tree instead of making it be the actual
optimization.

>  - A C code annotation target which creates a dependency tree
>    of machine code sections for the linker to optimize out or
>    not.

Allowing individual sections to have their own dependencies will do
this perfectly fine.

> Anyway, I don't think I will ever have the time to code any of
> this, so...
>

I don't know how much time you have, but you might compare it to the
description that I've given of how to implement unused function
removal. Alternatively, I'm going to be testing a pointer-based
incremental garbage collector sometime soon (or, more accurately, I'll
probably be testing the tree implementation that it requires within
the week), so poke me sometime in October to get it done. That way
building the parse tree will be easier (it's going to be pretty nice
really, because you'll only need to deal with a handful of functions:
the system does all of the heavy lifting).

> Date: Tue, 24 Sep 2013 10:55:46 +0200
> From: Vittorio Giovara <[email protected]>
> To: [email protected]
> Subject: [Tinycc-devel] How to get DCE (or something similar) [was:
>       inline assembly and optimization passes]
> Message-ID:
>       <cablwns85qv7ubvgtgu0dqsfowaras7elzdstpbs0inp6skc...@mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Tue, Sep 24, 2013 at 4:44 AM, Jared Maddox <[email protected]>
> wrote:
>>> Date: Mon, 23 Sep 2013 14:31:04 +0200
>>> From: Vittorio Giovara <[email protected]>
>>> To: [email protected]
>>> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
>>> It would be really nice to have some compiler switch (if not
>>> integrated) that enable this functionality.
>>
>> What kind of dead code? If you're talking about getting rid of ifs,
>> whiles, etc., that you can theoretically know at compile-time will
>> never be used, then they fall into two categories:
>> 1) those that are constants entirely within the argument to the
>> conditional ( e.g. if( 0 ), and at a higher level if( a / a > 1 ) ),
>> and
>> 2) those that are constrained to a range of values within the
>> conditional that will always evaluate to false, but are not strictly
>> constants.
>>
>> Category 1, and especially its first sub-case, are relatively easy to
>> deal with, and should be possible to optimize out of TCC within the
>> current framework. Subcase 2 is a little more complicated, because you
>> need to use elimination or similar to determine that one or more
>> things are actually just a distraction within the expression, and
>> theoretically shouldn't have any effect; however, in the real
>> bit-limited world, this can technically have an effect by interfering
>> with floating point accuracy calculations, so it should probably be
>> placed under a higher-level optimization flag.
>>
>> Category 2 requires a more in-depth analysis, which is, in turn,
>> relevant when inlining functions from other compilation contexts as
>> well. Thus, I'd say that both because of its greater complexity, and
>> because of its utility when operating on the output of other
>> compilation stages, treatment for category 2 should be kept separate
>> from TCC proper.
>
> Yes, as you got below I meant removing functions such as
>
> if (0)
>    some_function();
>
> The "problem" is that many applications use this construct instead of
> adding #ifdefs everywhere.

So I've heard. Just use #if already, people! It exists for a reason, after all.
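
To spell out the difference, here is a minimal sketch. HAVE_FANCY_LOG,
fancy_log() and process() are hypothetical names invented for the
example. With #if the call is gone before the parser ever sees it, so
fancy_log() needs neither a declaration nor a definition when the
feature is off. The if( 0 ) spelling would instead be
"if (HAVE_FANCY_LOG) fancy_log("processing");", which still has to be
parsed (so fancy_log() must be declared in C99 and later), and still
leaves an undefined reference at link time unless the compiler drops
the dead branch.

```c
#define HAVE_FANCY_LOG 0   /* hypothetical feature switch */

#if HAVE_FANCY_LOG
void fancy_log(const char *msg);   /* only declared when the feature exists */
#endif

int process(int value)
{
#if HAVE_FANCY_LOG
    fancy_log("processing");       /* removed by the preprocessor when 0 */
#endif
    return value * 2;
}
```

As written this compiles and links with no fancy_log() anywhere in the
program, which is exactly the case that trips up a compiler that
doesn't eliminate the if( 0 ) branch.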

>>> Date: Mon, 23 Sep 2013 17:05:35 +0200
>>> From: Vittorio Giovara <[email protected]>
>>> To: "Thomas Preud'homme" <[email protected]>
>>> Cc: [email protected]
>>> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
>> If, on the other hand, you're talking about dead code elimination in
>> the sense of getting rid of functions that aren't used, then see the
>> following. This interplays with getting rid of similar code inside of
>> functions, but is easier in some ways.
>
> So, I took a very noob approach and tried the following in tccelf.c:
>
> 1. don't abort when you find an undefined symbol (gives FPE at execution)
> 2. mark every undefined symbol as weak linked (segfaults)
> 3. hijack every undefined function to a sink function that does
> nothing (couldn't fully implement it)
>

What was the problem? If it was just how to write the function itself
then do it in assembly, have it ignore its arguments, and have it do
nothing but call exit() with a value indicating failure. Since the
function should never be reached, actually reaching it should be
treated as a fatal error 100% of the time.
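
As a rough illustration, here is the same idea in plain C rather than
assembly (so it only covers the no-argument case; ignoring arbitrary
arguments portably is the reason I'd reach for assembly). Everything
here is an assumption made for the sketch: struct symbol,
resolve_or_sink() and undefined_symbol_sink() are hypothetical names,
not anything in tccelf.c, and the table lookup just stands in for
whatever the linker really does when it resolves a symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void (*func_ptr)(void);

/* Hypothetical symbol table entry; not TCC's ELF symbol type. */
struct symbol {
    const char *name;
    func_ptr addr;
};

/* Reaching this means a call that "could never happen" did happen. */
void undefined_symbol_sink(void)
{
    fputs("fatal: called a function that was never defined\n", stderr);
    exit(EXIT_FAILURE);
}

/* Looks name up in table; undefined names resolve to the sink. */
func_ptr resolve_or_sink(const struct symbol *table, size_t count,
                         const char *name)
{
    assert(table != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].addr;
    }
    return undefined_symbol_sink;
}
```

The point of failing loudly instead of doing nothing is that a silent
no-op sink would hide exactly the case where the "dead" code turned
out not to be dead.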

>> Try looking here:
>> https://lists.nongnu.org/archive/html/tinycc-devel/2012-10/msg00044.html
>>
>> The gcc compiler & linker flags mentioned there are directly relevant
>> to this, and basically describe the relevant operations: each function
>> and variable goes into a separate section within the object file (or,
>> in the case of TCC, sometimes just in memory), and either all of the
>> sections get stored into a library file, or only the ones actually
>> used get stored into an object or executable file. So, multi-step
>> process, but the end result is definitely useful.
>>
>> AFTER that gets done you can worry about invoking the optimization
>> with a single flag, since those capabilities are just as useful by
>> themselves as they are when grouped together.
>
> As first step I'd be happy with compiling some code like the one
> above, the optimization of actually removing it might come later on.
>

As for if( 0 ), removing the code is simply how you do the optimization.

> Date: Tue, 24 Sep 2013 12:05:56 +0200
> From: grischka <[email protected]>
> To: [email protected]
> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Jared Maddox wrote:
>> decl(  ) comes from tccgen.c, and looks like the bottom-level function
>> in the parser proper.
>
> true.
>
>> I ASSUME that when it finishes, you'll have a
>> ready-to-use parse tree in the active TCCState.
>
> false.  TCC translates to machine code immediately while it reads
> along the input C:
>      http://bellard.org/tcc/tcc-doc.html#SEC30
>
>> ... then I'd suggest going back to my suggestion of an IL/IC target
>
> Actually there is a CIL generator in the TinyCC sources that was
> written once by Fabrice Bellard:
>      http://repo.or.cz/w/tinycc.git/blob/refs/heads/mob:/il-gen.c
>      http://repo.or.cz/w/tinycc.git/blob/refs/heads/mob:/il-opcodes.h
>      http://en.wikipedia.org/wiki/Common_Intermediate_Language
>
> However it has fallen behind since then so definitely would need
> some care if someone wants to reactivate it.
>

I did notice that. I considered asking about it, but ultimately
decided that it PROBABLY wasn't quite on subject (this might have been
influenced by some other notes that I wrote with it).

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
