> Date: Sun, 15 Dec 2013 10:56:06 +0000 (GMT)
> From: Rob <[email protected]>
> To: [email protected]
> Subject: Re: [Tinycc-devel] RE :Re: inline functions
> Message-ID: <alpine.DEB.2.02.1312151050210.19371@jeffraw>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> On Wed, 11 Dec 2013, Jared Maddox wrote:
>
>>> Date: Wed, 11 Dec 2013 23:27:37 +0000 (GMT)
>>> From: Rob <[email protected]>
>>>
>> A flag won't help, I'm pretty sure this is an inherent limit of TCC.
>> The lack of stored parse information about functions means that the
>> information that TCC would need to perform "proper" inlining just
>> isn't available to the compiler when the inlining must be done,
>> because it isn't kept anywhere.
>
> Ah okay, I was thinking of testing for VT_INLINE and if extern and
> static aren't present, we skip that definition, and just don't do any
> inlining for now.
>

Yeah, as you might be guessing right now, that gets you an "undefined
reference" error unless the function is also unused.

>> There are a few ways to change this:
>>
>> 1) When I finish my current pet project (which is taking longer than I
>> planned, due to sluggishness on my part) I'm planning to write a
>> reference-loop-safe smart-pointer system in C, and use it to add a
>> "parse tree generator" backend to TCC. At that point a parse tree will
>> be trivially available, and the problem can be dealt with by a
>> function that receives said parse tree, before passing it on to a
>> parse tree -to- code stage. If you think that this feature is half-way
>> time critical, then don't wait for me to do this.
>>
>> 2) We could always cheat by coming up with some custom function format
>> based upon working "on top" of the caller's stack, and claim that
>> we've inlined in that way (we might even be able to do a copy-paste
>> into the correct location on some platforms).
>>
>> 3) We could come up with a "raw function" object format that describes
>> the calling convention, alignment requirements, and "patch
>> requirements" required to copy the enclosed inlineable function into
>> another function. Nesting of inline functions wouldn't work if it
>> hadn't already been done to the inlineable function, but no big
>> surprise there. Note that this could technically be done with a copy
>> of the source code & information about what globals it can see,
>> instead of with a proper object file.
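
The "undefined reference" outcome mentioned above matches how C99
treats a plain `inline` definition: without `extern` or `static` it is
only an inline definition and does not provide an external one, so a
call that is not actually inlined has nothing to link against. A minimal
sketch, assuming a C99-mode compiler; `twice` and `call_via_pointer` are
made-up names for illustration only:

```c
/* A plain inline definition. On its own, C99 does not treat this as
 * an external definition, so a non-inlined call to twice() could fail
 * to link with "undefined reference". */
inline int twice(int x)
{
    return 2 * x;
}

/* Declaring the function extern in exactly one translation unit makes
 * that unit provide the external definition. Writing the definition as
 * "static inline" instead would also avoid the link error. */
extern inline int twice(int x);

/* A call through a pointer cannot rely on inlining, so it needs the
 * external definition to exist. */
int call_via_pointer(int x)
{
    int (*fp)(int) = twice;
    return fp(x);
}
```

Removing the `extern inline` declaration is the quickest way to see the
link error with a compiler that follows the C99 rules and chooses not to
inline the call.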
>
> I'd be interested in your smart-pointer system, do you have a blog or
> anything we can follow?

I could say yes, but I'll say no, because it's been at least six months
since I used it, at which point it was wholly dedicated to the kind of
minutiae that technical schools have you post when you're in the very
early stages of their programs. As it is, the biggest blocker is
currently a software renderer pet project which I had intended to finish
a month or two ago, but let slip instead. The only thing that I recall
actually needing for the smart-pointers is a good search tree, and I was
figuring that I could get one up, and then macro-ed, within a week once
I started working on it. Everything else was just working out the
details. Anything of this sort will, of course, start with subtle bugs,
but given that I've had a desire for this sort of system for years, it's
basically a given that I'll be making it work: the parse tree will just
be the first use of it, not the only one. I also want to use it for a
bencode parser, an MSCOM-knockoff (but with better reference tracking),
a custom language, and doubtless other things as well.

> But yeah, I imagine getting tcc to inline
> functions will be quite the overhaul. You're brave!
>

Well, to be fair, I'm not going after inlining, or even any use of the
parse tree itself at all. An older list subject was on compiling C to C
to produce optimizations, which is, as far as I know, pretty
nonsensical. But I DID figure that the TCC parse system is simple enough
that I could understand it in a sensible amount of time, and hence could
use it to build a parse tree for anyone else to use, which in turn would
give future projects of that sort a better starting point: a parse tree.

> I wonder if it would be possible to combine your patching idea and just
> run the vstack manipulation that the inline function does, but on the
> current function's vstack.
> The difficulty would be getting symbol
> references correct, we'd have to figure out a nice way of mapping
> argument symbols onto the rvalues we pass into the inline function.
>

If I were doing it, here's how I'd approach it:

1) I'd figure out how the code needed to be generated so that it didn't
care what I did with it: we're talking about full-sized pointers instead
of some optimized forms, always assembly long-jumps and never assembly
short-jumps, and some way to correctly designate stack positions DESPITE
borrowing part of the stack space of another function. The pointers
shouldn't be hard (in fact, I assume they're already like that). The
jumps might require a bit of work, but should be solvable by temporarily
throwing the tokens through a specialized "inlines parser" that just
changes the type of jump opcode issued (short jumps are size-efficient,
but only offer something like 127-byte jumps). The stack positions could
be designated via either the stack pointer or a
TWO-offsets-from-stack-base system.

2) Once you have the inlinable-assembly generator ready, you need your
meta-data. This consists of:

 - the base-address (within the "object file", which I probably would
   implement as a file, albeit a temporary one), length, and alignment
   requirements of the assembly (the only case that you'd need alignment
   for that I KNOW of is if we want to target Google NaCl in the future,
   but I wouldn't be surprised at actual processors having
   requirements);

 - the length (in entries or bytes, whichever) of the patch table, and a
   list of assembly-address/modifier/type triplets. Each triplet
   specifies the address to be modified within the inlinable assembly
   (assumed to be the same size as native pointers), a necessary piece
   of data to be used in the modification (for an argument or an
   invocation variable you'd need an offset), and the type of
   modification (i.e. is the location an argument/variable, a
   thread-local, a function-static, an offset to some location within
   the inlinable assembly itself, something else that doesn't come to
   mind, etc.);

 - the string-table (string tables apparently have pretty standard
   formats: they're seemingly always a set of C strings followed by an
   extra null);

 - a table of patch-tables, indexed via the string table (i.e., the
   first entry corresponds to the first string, etc.), where each
   patch-table consists of the length, a type to apply to the whole
   patch-table, and a list of assembly-address/modifier pairs.

Note that the first piece of meta-data that I described allows the
actual inlinable-assembly to be placed anywhere: I would probably
initially write it into a "scratch" file while the patch tables & such
were being written, and then just copy it in, and modify the base
address initially written to the "storage" file.

3) That having been done, all calls to the inline would then consist of
copying the assembly to the current object-file output, modifying the
appropriate locations according to the information in the patch-tables
as you go along.

Note that there are certain to be plenty of details. I haven't said
whether you modify the argument accesses to point to the locations where
their originating values come from (note that I actually oppose this: it
strikes me as a bad idea), or instead push copies of those values onto
the stack as if you were actually calling the function (note that this
can actually be done via the redirect-to-origin option that I'm
suggesting against). You also need some way to pop out a conventional
form of the function if its address gets taken, of course, but I expect
that you can do that with one bit per inline function, and a final check
to see if the bit has been set (if it has, then spit out a relevant
prologue, inline the code into that function as if it had been called,
pop out an epilogue, and you're done).
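
To make the patch-table idea in steps 2 and 3 more concrete, here is a
minimal C sketch. Everything in it is an assumption made for
illustration: `struct patch`, `struct inlinable`, `inline_expand`,
`INLINE_FAIL` and the two patch kinds are hypothetical names and are not
part of TCC, and the string table and per-symbol patch-tables are left
out. It only shows copying a block of inlinable code into an output
buffer at an aligned position and rewriting pointer-sized fields from
address/modifier/type triplets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical patch kinds (illustrative only, not TCC identifiers). */
enum patch_type {
    PATCH_STACK_SLOT,  /* argument/variable: modifier is an offset from the frame base */
    PATCH_SELF_OFFSET  /* modifier is an offset into the inlinable code itself */
};

/* One assembly-address/modifier/type triplet. */
struct patch {
    size_t addr;        /* offset of a pointer-sized field within the code */
    intptr_t modifier;  /* data used when computing the patched value */
    enum patch_type type;
};

/* Meta-data for one inlinable function (string table omitted). */
struct inlinable {
    const uint8_t *code;
    size_t length;
    size_t alignment;   /* must be non-zero */
    const struct patch *patches;
    size_t npatches;
};

#define INLINE_FAIL ((size_t)-1)

/* Copy fn into out at the first aligned position at or after out_pos,
 * apply its patches, store the placement in *placed_at, and return the
 * position just past the copied code (or INLINE_FAIL on a bounds error). */
static size_t inline_expand(const struct inlinable *fn, uint8_t *out,
                            size_t out_cap, size_t out_pos,
                            intptr_t frame_base, size_t *placed_at)
{
    size_t start, i;

    assert(fn->alignment != 0);
    start = (out_pos + fn->alignment - 1) / fn->alignment * fn->alignment;
    if (start > out_cap || fn->length > out_cap - start)
        return INLINE_FAIL;

    /* Zero bytes are placeholder padding; a real backend would pick its own. */
    memset(out + out_pos, 0, start - out_pos);
    memcpy(out + start, fn->code, fn->length);

    for (i = 0; i < fn->npatches; i++) {
        const struct patch *p = &fn->patches[i];
        uintptr_t value;

        /* Each patched field must lie entirely inside the copied code. */
        if (p->addr > fn->length || sizeof value > fn->length - p->addr)
            return INLINE_FAIL;
        if (p->type == PATCH_STACK_SLOT)
            value = (uintptr_t)(frame_base + p->modifier);
        else
            value = (uintptr_t)((intptr_t)start + p->modifier);
        memcpy(out + start + p->addr, &value, sizeof value);
    }
    *placed_at = start;
    return start + fn->length;
}
```

A caller would hold on to `placed_at`, since any later fix-up that is
relative to the copied code needs to know where that code was written in
the output.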

Note that this whole scheme does result in a calling convention of its
own, an "inline convention", which I would suggest deriving from the
platform's C calling convention. Also, I just remembered that return
information needs to be present in the inlinable-assembly as well. This
should be achievable with an appropriate entry in the symbol table (I'd
suggest against plain "return" as the symbol, since we might someday see
arbitrary-identifier features added to C: some sort of escape-code would
be better instead). The return itself is best achieved with a jump. You
will, naturally, have to go back through the final output code after the
inlining to provide the correct jump target: just keep the
inlinable-assembly temp file open, and track the address where you most
recently started writing the inlinable into the output.

>> If such a thing were done, then I would ALSO suggest putting in a
>> compiler flag to force inlines to be implemented as statics, like they
>> currently are (or at least should presumably be).
>
> Yeah, perhaps using attributes, like always_inline and noinline from gcc?
>

I was thinking of a command-line argument to apply it to the entire
file, but I think that implementing your version and then having a
command-line argument to set it as the default behavior would be the
better solution (it would also help to play nicer on platforms where
inlining support is being worked on, but isn't yet available).

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
