> Date: Wed, 10 Oct 2012 14:53:18 -0400
> From: Milutin Jovanović <[email protected]>
> To: [email protected]
> Subject: Re: [Tinycc-devel] TCC and "smart" linking
> Message-ID:
>         <cansc6g5mf4yuh-qzbpo4osrss115vuqv2tejnt0+98bok8w...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I am little surprised that the tone of this thread has turned negative. It
> is really not clear what the argument is about. Peoples preferences?
>

As far as I can tell, Oleg is angry that people haven't quickly jumped onto his suggestion and embraced it. No big deal; it happens to lots of folks.

> I cannot imagine that marking symbols as referenced to be expensive... If
> it was as simple as not outputting into target executable symbols that are
> not marked as used, this would be a truly simple task. And fast, while
> we're on it. However, the problem arises when linking with libraries, which
> is already compiled code. When including a symbol from a library, a
> mechanism needs to be created to determine which symbols this routine
> references, and then recursively repeat the process. I admit that I am not
> expert on binary and library formats... The only way I can think of doing
> this is to examine relocation tables, and hopefully extract/deduce this
> information from it. However I don't know if this is even possible. So,
> this could be a non trivial and not so fast operation.
>

Realistically, you need to separate out everything that can be REFERRED TO by a symbol into its own labeled portion of the object file. Dropping symbols is easy, but dropping what those symbols refer to requires that those things (whether code or variables) have extra data stored somewhere to indicate where they start and end. Inlined code further complicates the issue, though only within certain bounds.

I actually have some (untested) code that does some related work: this problem can be treated as a matter of garbage collection. Bring a representation of the object file into memory, then copy all of the 'expected portions' of the object file (e.g. the main function, any entry & exit code expected for the target, etc.), along with every portion that you can detect is directly or indirectly reachable from those expected portions (e.g. functions referred to in main). What you end up with is exactly what Oleg wants.
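
To make the garbage-collection idea concrete, here is a rough sketch of the marking pass. This is NOT the untested code mentioned above, and nothing in it comes from TCC: `struct section`, `refs`, `MAX_REFS`, `MAX_SECTIONS` and `mark_live` are all made-up names, and it assumes each function or variable already sits in its own section with its outgoing references (e.g. as recovered from relocation entries) listed by index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS     4   /* hypothetical cap on references per section */
#define MAX_SECTIONS 64  /* hypothetical cap on sections in this sketch */

/* Hypothetical in-memory model: one section per function or variable. */
struct section {
    const char *name;
    int refs[MAX_REFS]; /* indices of the sections this one refers to */
    int nrefs;
    bool live;          /* set by mark_live() if reachable from a root */
};

/*
 * Mark every section reachable from the given roots (the "expected
 * portions": main, entry/exit code, ...). Returns the number of live
 * sections. Anything left with live == false can be dropped.
 */
static size_t mark_live(struct section *secs, size_t n,
                        const int *roots, size_t nroots)
{
    int stack[MAX_SECTIONS]; /* explicit worklist instead of recursion */
    size_t top = 0, count = 0;

    assert(n <= MAX_SECTIONS);

    for (size_t i = 0; i < n; i++)
        secs[i].live = false;

    /* Seed the worklist with the roots. */
    for (size_t i = 0; i < nroots; i++) {
        int r = roots[i];
        if (!secs[r].live) {
            secs[r].live = true;
            stack[top++] = r;
        }
    }

    /* Each section is pushed at most once, so the stack cannot overflow. */
    while (top > 0) {
        struct section *s = &secs[stack[--top]];
        count++;
        for (int j = 0; j < s->nrefs; j++) {
            int t = s->refs[j];
            if (!secs[t].live) {
                secs[t].live = true;
                stack[top++] = t;
            }
        }
    }
    return count;
}
```

With four sections where main refers to "used", "used" refers to "helper", and "unused" also refers to "helper", calling `mark_live` with main as the only root marks the first three live and leaves "unused" unmarked, so only "unused" gets dropped. A real linker would then copy just the live sections and fix up the relocations.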

However, this approach requires that you store sufficient information to separate out these portions, which I wouldn't assume that TCC does. Additionally, it WON'T play nicely with anything that expects to use platform-native tools, or some understanding of its executable's data structure, to access sections of code or data at run-time without symbols or relocation data. Also, this isn't really the compiler's job: the compiler only needs to make certain the necessary data is available. Pruning unneeded functions has to happen in the linker, because only the linker can know whether a symbol is REALLY unreferenced.

> Regarding the comparison to Turbo Pascal, if I remember correctly, they
> used their own library format, and it might contain this dependency
> information that can be used to quickly eliminate unused code. tcc however
> needs to use standard formats.
>

Indeed.

> All in all, the problem does not seem trivial, and is made worse by the
> fact that it needs to work with multiple file formats/platform. Judging by
> activity of this project, it seems little ambitious.
>
> Miki.
>

Very much so. Ironically, I wouldn't be opposed to building such a system on top of TCC (if there were a TCC 'native' library project I would have at least donated a balanced tree implementation by now), but I wouldn't be willing to incorporate it into TCC itself, and I'm not certain I would want to work on a project with someone who acts as domineering as Oleg (he comes across as a screamer).

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
