On 09/30/2011 10:25 AM, Kirill Smelkov wrote:
>>> I have it as git branch btw. If anyone is interested I could push
>>> it on repo.or.cz.
>>
>> I thought your current maintainer policy was zero editorial control,
>> just let random strangers form a slush pile in the mob branch and then
>> ship it? (I can't say I've been reading the list very closely.)
>>
>> It's a little disheartening to see issues I fixed over three years ago
>> come up over and over again. If I recall, you refused to port things
>> you either "didn't understand" or didn't see a need for (such as
>> refactoring the code so it wasn't one big tcc.c without even a tcc.h)
>> from my tree the last time I gave up and waited for you guys to just
>> stop it.
>>
>> I put a lot of work into my version, but Fabrice wouldn't hand the
>> project to anybody who wanted to move the code out of the gnu.org CVS
>> repository, and Linux Weekly News covered releases that went up on
>> tinycc.org, even back when the bulk of the changes in them were ported from
>> my version. I wouldn't mind so much if you didn't REFUSE TO TAKE
>> OBVIOUS THINGS that you're now finding a need for all these years later.
>>
>> Sigh. I'm going to go back to ignoring this list now. I should go
>> catch up on the pcc and llvm lists...
>>
>> Rob
>
>
> Rob, thanks for the links and for sharing. Are you in principle ok to
> relicense your changes back from GPL to LGPL so that they could be
> included one way or another into tcc?
Years ago, in my repo, I significantly refactored the code to do things
like put declarations in headers, put architecture-specific code in
subdirectories with consistent names, break out the command line option
parsing into a separate "options.c", and so on. There were functions
named g() and o() which were IMPOSSIBLE to grep for (leftovers from its
heritage in the obfuscated c code contest, I guess), which I laboriously
replaced with sane names. I started grouping random global variables
into structures, I was working on splitting the preprocessor, compiler,
and linker code into three files... Half of what I did was code CLEANUP.
Grischka had no interest in any of that. Instead he cherry picked a few
commits he understood:
http://lists.gnu.org/archive/html/tinycc-devel/2007-12/msg00040.html
And then told me that I had to abandon my cleanup work and start over on
his tree, and explain everything to his satisfaction (as a Windows guy)
before it could go in. I'm still kind of pissed about it:
http://landley.net/code/tinycc
I.E. his current policy of "go ahead and commit anything you like, I
don't care" isn't what he used to do. It's an overreaction in the other
direction.
For the patches in question, I'm pretty sure he _specifically_ told me
that he _didn't_ want colon separated search paths. (Even though I
broke down and made it configurable so it could be semicolon instead for
his windows sensibilities.)
My initial complaint was CVS made it hard for me to work, true:
http://lists.gnu.org/archive/html/tinycc-devel/2008-09/msg00013.html
But I changed the license because the zombie CVS tree was following me,
and I didn't want to encourage it:
http://lists.gnu.org/archive/html/tinycc-devel/2008-09/msg00027.html
But these days, my complaint is that I have no confidence whatsoever in
tinycc's maintainership. It has the tinycc.org domain, and Fabrice
handed over the project, so it is the official final resting place of
tcc. But it's still stagnant, because Fabrice put a Windows developer
in charge of the project, one who apparently does not understand open
source development in the slightest. He's putting out a windows-only
version of tcc as far as I can tell, one which will never build an
unmodified Linux kernel (has made zero progress on this front in the
past _THREE_YEARS_), thus it cannot ever act as (even an inferior) gcc
replacement.
Thus I have no interest in it. I check back every few months to see if
it's _died_ yet, but it's really mostly curiosity at this point.
Someday if I get back to this topic, I want to glue either sparse or the
tcc front end to qemu's tcg back end and produce a new compiler that A)
supports all the targets qemu does, B) can build linux and busybox and
uClibc and itself (thus providing a self-bootstrapping system; I'd
upgrade busybox to have missing bits like "make").
See the attached files for some todo items on that front. But according
to the dates on those files, I last touched them in May 2009, so this
isn't exactly a priority for me. Tinycc isn't dead yet, and there's
plenty of _other_ interesting stuff like the LLVM backend for Sparse:
http://lwn.net/Articles/456709/
> For us tinycc users, it's a pity to see this issue stagnate over
> and over again. CVS is de facto gone; if that was the obstacle, there is
> no longer any reason to block your changes from being merged!
Tinycc has been stagnant since Fabrice left in 2005. It saw surges of
effort from me, from David Wheeler's trusting trust paper, the people
who did the arm and x86-64 ports, and grischka's work on windows stuff
(and on getting gcc to build under it)... but it never added _up_ to
anything. I started by collecting together various out of tree patches,
and then poured hundreds of hours of my own effort into it, but nobody
wanted to build on my work because it wasn't "official".
Instead a windows developer turned it into a windows-only project, and
he has such a poor understanding of open source development that he
doesn't even do code review or control access to the repository.
Do you have any idea how WRONG that is? Alan Cox once told me "A
maintainer's job is to say no". And he's right: they're like the editor
of a magazine, going through the slush pile to find the best bits,
polish them up, stitch them into a coherent next issue, publish, and
repeat. This is not a new task, this is basic editorial judgement that
people have been doing since Gutenberg invented the printing press.
Publishing the raw slush pile is NOT INTERESTING. Fighting off
Sturgeon's Law is what editors _do_. It doesn't matter if we're talking
about the four levels of Linux developers (developer, maintainer,
lieutenant, architect) bouncing back patches with comments, or
Canonical leaving 98% of sourceforge OFF their install CD, or sites
like Slashdot and Fark publishing a tiny number of headlines from the
thousands they get each day, or Google giving you a page of the ten most
interesting hits from an internet approaching a trillion pages of spam,
porn, and emo blogs with cat pictures.
Grischka has not set direction for the project, let alone goals.
Where's the list of problems that need to be solved? Where's the "and
now we build package X" announcements? Where's the RELEASE SCHEDULE?
http://video.google.com/videoplay?docid=-5503858974016723264
This project is still unmaintained, it's just unmaintained by Grischka.
The last release on tinycc.org was TWO AND A HALF YEARS AGO. Even
uClibc never got quite that bad.
> Sometimes people need time to de-cvs'ify themselves and adopt good
> distributed vcs practices. I agree the current mob rules are somewhat
> funny, but it would be a great pity if that were again a blocker for
> cooperation.
I do not trust Grischka's technical judgement, his commitment to the
project (I put way more time into my fork than he's put into the
official version), his leadership abilities, or his organizational skills.
I spent three years of my life improving this project (not just coding
but documentation and testing and design and so on), and the result was
essentially discarded at the whim of somebody who DIDN'T UNDERSTAND WHAT
I WAS DOING. The fact the rest of you are now finally realizing that
some of the problems I already solved years ago were, in fact, real
issues, is mildly amusing to me in a morbid way.
If you have competent developers on this list you don't NEED my patches,
you can figure out how to do it from the _idea_ in a couple hours. If
you need more details, I discussed the general idea somewhat at length
at the OLS compiler BOF in 2008:
http://free-electrons.com/pub/video/2008/ols/ols2008-rob-landley-linux-compiler.ogg
I doubt my patches will remotely apply to the git codebase anyway, I
changed a _lot_.
Rob
Seven packages. This is to replace binutils and gcc.
FWL needs: ar as nm cc gcc make ld
  - Why gcc (shouldn't cc cover it? What builds?)
  - Need a make. Separate issue, busybox probably.
Loot tinycc fork to provide:
  cc - front-end option parsing
    multiplexer (swiss-army-executable ala busybox)
      cross-prefix, so check last few chars: cc,ld,ar,as,nm
    Calls several automatically (assembler, compiler, linker) as necessary.
    Pass on linker options via -Wl,
    Merge in FWL wrapper stuff (ccwrap.c)
      call out again? distcc support?
    Path logic:
      compiler includes: ../qcc/include
      system includes: ../include
      compiler libraries: ../qcc/lib
      system libraries: ../lib
      tools: built-in (or shell out with same prefix via $PATH)
      command line stuff: current directory
  ld - linker
    #include <elf.h> which qemu already has.
    Support for .o, .a, .so -> exe, .so
    Support for linker scripts
  ar - library archiver
    Busybox has partial support (still read-only?)
    ranlib?
  cc1 - compiler
    preprocessor (-E) support
    output (.c->.o) support
  as - assembler
  nm - needed to build something?
binutils provides:
  ar as nm ld - already covered
  strip, ranlib, addr2line, size, objdump, objcopy - low hanging fruit
  readelf - uClibc has one
  strings - busybox provides one
  Probably not worth it:
    gprof - profiling support (optional)
    c++filt - C++ and Java, not C.
    windmc, dlltool - Windows only (why is it installed on Linux?)
    nlmconv - Novell Netware only (why is this installed on Linux?)
QCC - QEMU C Compiler.
Use QEMU's Tiny Code Generator as a backend for a compiler based on my old
fork of Fabrice Bellard's tinycc project.
Why?
QEMU's TCG provides support for many different targets (x86, x86-64, arm,
mips, ppc, sh4, sparc, alpha, m68k, cris). It has an active development
community upgrading and optimizing it.
QEMU application emulation also provides existing support for various ELF
executable and library formats, so linking logic can presumably be merged.
(See elf.h at the top of qemu.) QEMU is also likely to grow coff and pxe
support in future.
Building a self-bootstrapping system:
My Firmware Linux project builds the smallest self-bootstrapping system
I could come up with using the following existing packages:
gcc, binutils, make, bash, busybox, uClibc, linux
This new compiler should replace both binutils and gcc above. (As a smoke
test, the new system should still be able to build all seven packages.)
To build those packages, FWL needs the following commands from the host
toolchain. (It can build everything else from source, but building these
without already having them is a chicken and egg problem.)
ar as nm cc gcc make ld /bin/bash
The reason it needs "gcc" is that the linux and uClibc packages assume
their host compiler is named "gcc", and call that name instead of cc even
when it's not there. (You can mostly override this by specifying HOSTCC=$CC
on the make command line, although a few places need actual source patches.)
Ignoring gcc, make, and bash, this leaves "ar, as, nm, cc, and ld" as
commands qcc needs to provide for a minimal self-bootstrapping system.
Note that the above set of tools is specifically enough to build a fresh
compiler. When building a linux kernel, creating a bzImage requires objcopy,
building qemu requires strip, etc.
What commands does the current gcc/binutils combo provide?
gcc 4.1 provides the commands:
cc/gcc - C compiler
cpp - C preprocessor (equivalent to cc -E)
gcov - coverage tester (optional debugging tool)
Of these, cc is required, cpp is low hanging fruit, and gcov is probably
unnecessary.
Binutils provides:
ar - archiver, creates .a files.
ranlib - generate index to .a archive (equivalent to ar -s)
as - assembler
ld - linker
strip - discard symbols from object files (equivalent to ld -S)
nm - list symbols from ELF files.
size - show ELF section sizes
objdump - show contents of ELF files
objcopy - copy/translate ELF files
readelf - show contents of ELF files
addr2line - convert addresses to filename/line number (optional debug tool)
strings - show printable characters from binary file
gprof - profiling support (optional)
c++filt - C++ and Java, not C.
windmc, dlltool - Windows only (why is it installed on Linux?)
nlmconv - Novell Netware only (why is this installed on Linux?)
Of these, ar, as, ld, and nm are needed; ranlib, strip, addr2line, and
size are low hanging fruit; size, objdump, objcopy, and readelf are
variants of the same logic as nm; and gprof, c++filt, windmc, dlltool,
and nlmconv are probably unnecessary.
Standards:
The following utilities have SUSv4 pages describing their operation, at
http://www.opengroup.org/onlinepubs/9699919799/utilities
ar, c99, nm, strings
This means the following don't:
ld, cpp, as, ranlib, strip, size, readelf, objdump, objcopy, addr2line
(There isn't a "cc" standard, but you can probably use "c99" for that.)
Existing code:
multiplexer:
The compiler must be available under several different names, yet the same
functionality must be callable from a single compiler executable,
assembling when it encounters embedded assembler, passing on linker
options via "-Wl," to the linking stage, and so on.
The easy way to do this is for the qcc executable to be a swiss-army-knife
executable, like busybox. It needs a command multiplexer which can figure
out which name it was called under and change behavior appropriately, to
act as a compiler, assembler, linker, and so on.
This multiplexer should accept arbitrary prefixes, so cross compiler names
such as "i686-cc" work. This means instead of matching entire known names,
the multiplexer should check that commands _end_ with recognized strings.
(This would not only allow it to be called as both "qcc" and "cc", but
would have the added bonus of making "gcc" work like "cc" as well.)
Both busybox and tinycc already handle this. Pretty straightforward.
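As a rough sketch of the suffix matching described above (not the code
busybox or tinycc actually use), something like the following would do.
The enum values and the function name tool_from_name() are made up for
illustration:

```c
#include <string.h>

/* Hypothetical tool identifiers, purely for illustration. */
enum tool { TOOL_UNKNOWN, TOOL_CC, TOOL_LD, TOOL_AR, TOOL_AS, TOOL_NM };

/* Pick a tool from the name the program was invoked under. Only the end
 * of the basename is compared, so "i686-cc", "qcc" and "gcc" all select
 * the compiler. */
enum tool tool_from_name(const char *argv0)
{
    static const struct { const char *suffix; enum tool tool; } table[] = {
        { "cc", TOOL_CC }, { "ld", TOOL_LD }, { "ar", TOOL_AR },
        { "as", TOOL_AS }, { "nm", TOOL_NM },
    };
    const char *base = strrchr(argv0, '/');
    size_t len, i;

    base = base ? base + 1 : argv0;     /* strip any leading directory */
    len = strlen(base);
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        size_t slen = strlen(table[i].suffix);

        if (len >= slen && !strcmp(base + len - slen, table[i].suffix))
            return table[i].tool;
    }
    return TOOL_UNKNOWN;
}
```

A main() would call tool_from_name(argv[0]) and dispatch on the result.
One catch with a bare suffix test: an unrelated name such as "tar" also
ends in "ar", so this only makes sense for symlinks the compiler
installs itself.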
cc/c99 - front-end option parsing
Both tinycc's options.c and ccwrap.c (in FWL) handle command line option
parsing, in different ways. Both take as input the same command line
syntax as gcc, which is more or less the c99 command line syntax from
SUSv4:
http://www.opengroup.org/onlinepubs/9699919799/utilities/c99.html
What ccwrap.c does is rewrite a gcc command line to turn "cc hello.c"
into a big long command line with -L and -I entries, explicitly specifying
header and library paths, the need to link against standard libraries
such as libc, and to link against crt1.o and such as appropriate.
Such a front end option parser could perform such command line rewriting
and then call a "cc1" that contains no built-in knowledge about standard
paths or libraries. This would neatly centralize such behavior, and
if the rewritten command line could actually be extracted it could be
tested against other compilers (such as gcc) to help debugging.
Note that adding distcc or ccache support to such a wrapper is a fairly
straightforward item for future expansion.
The option parser needs to distinguish "compiling" from "linking".
When compiling, the option parser needs to specify two include paths;
one for the compiler (varargs.h, defaulting to ../qcc/include) and
one for the system (stdio.h, defaulting to ../include).
When linking, the option parser needs to specify the compiler library
path (where libqcc.a lives, defaulting to ../qcc/lib), the system
library path (where libc.a lives, defaulting to ../lib), and add
explicit calls to link in the standard libraries and the startup/exit
code. Currently, ccwrap.c does all this.
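To make the rewriting concrete, here is a minimal sketch (not the actual
ccwrap.c logic) that turns a short command line into an explicit one.
The names rewrite_args(), stops_before_link() and cat3() are
hypothetical, "top" stands for whatever directory the defaults are
resolved against, and the link line is heavily simplified: a real one
needs more startup objects than crt1.o, and "-c", "-S" and "-E" are
treated as the only options that stop before linking.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if the command line stops before the link stage. */
static int stops_before_link(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++)
        if (!strcmp(argv[i], "-c") || !strcmp(argv[i], "-S") ||
            !strcmp(argv[i], "-E"))
            return 1;
    return 0;
}

/* Concatenate three strings into freshly allocated memory. */
static char *cat3(const char *a, const char *b, const char *c)
{
    char *s = malloc(strlen(a) + strlen(b) + strlen(c) + 1);

    if (!s)
        abort();
    strcpy(s, a);
    strcat(s, b);
    strcat(s, c);
    return s;
}

/* Build a NULL-terminated argument list with explicit include paths and,
 * when linking, library paths, startup code and standard libraries. */
char **rewrite_args(int argc, char **argv, const char *top)
{
    /* argv[0] + 2 -I + 2 -L + crt1.o + user args + 2 -l + NULL */
    char **out = malloc((argc + 8) * sizeof(*out));
    int linking = !stops_before_link(argc, argv);
    int i, n = 0;

    if (!out)
        abort();
    out[n++] = cat3(argv[0], "", "");
    out[n++] = cat3("-I", top, "/../qcc/include");  /* compiler headers */
    out[n++] = cat3("-I", top, "/../include");      /* system headers */
    if (linking) {
        out[n++] = cat3("-L", top, "/../qcc/lib");  /* libqcc.a */
        out[n++] = cat3("-L", top, "/../lib");      /* libc.a */
        out[n++] = cat3(top, "/../lib/crt1.o", ""); /* startup code */
    }
    for (i = 1; i < argc; i++)
        out[n++] = cat3(argv[i], "", "");
    if (linking) {
        out[n++] = cat3("-l", "qcc", "");
        out[n++] = cat3("-l", "c", "");
    }
    out[n] = NULL;
    return out;
}
```

Printing the returned list is what would let the rewritten command line
be fed to another compiler (such as gcc) for comparison.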
Note that these default paths aren't relative to the current directory
(which can't change or files listed on the command line wouldn't be found),
but relative to the directory where the qcc executable lives. This allows
the compiler to be relocatable, and thus extracted into a user's home
directory and called from there. (The user's home directory name cannot
be known at compile time.) The defaults can also be specified as absolute
paths when the compiler is configured.
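A sketch of how such a default might be resolved, assuming the caller
already knows the path of the running executable (from argv[0] when it
contains a slash, or from something like /proc/self/exe on Linux). The
function name resolve_default() is invented for this example:

```c
#include <stdio.h>
#include <string.h>

/* Resolve a configured default path into buf. Absolute defaults are used
 * unchanged; relative ones are taken relative to the directory holding
 * the executable at "exe". Returns 0 on success, or -1 if buf is too
 * small or "exe" carries no directory to work from. */
int resolve_default(const char *exe, const char *dflt, char *buf, size_t size)
{
    const char *slash = strrchr(exe, '/');
    int n;

    if (dflt[0] == '/')
        n = snprintf(buf, size, "%s", dflt);
    else if (slash)
        n = snprintf(buf, size, "%.*s/%s", (int)(slash - exe), exe, dflt);
    else
        return -1;  /* found via $PATH: the caller must look it up first */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With exe "/home/user/qcc/bin/qcc" and the default "../include" this
yields "/home/user/qcc/bin/../include", wherever the tree was unpacked.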
The current ccwrap.c also modifies the $PATH (so gcc's front-end can
shell out to tools such as its own "cc1" and "ld"), and supports C++.
Although qcc doesn't need either of these, both are useful for shelling
out to another compiler (such as gcc).
The wrapper can split "compiling and linking" lines into two commands,
either saving intermediate results in the /tmp directory or forking and
using pipes. (That way cc1 doesn't need to know anything about linking.)
Optionally, the compiler can initialize the same structures used by the
linker, but is the speed/complexity tradeoff here worth it?
Note that "-run" support is actually a property of the linker.
cpp - preprocessor
This performs macro substitution, like "qcc -E".
cc1 - compiler
This compiles C source code. Specifically, it converts one or more .c
files into a single .o file, for a specific target.
Generating assembly output is best done by running the binary tcg output
through a disassembler. Keep it orthogonal.
ld - linker
This needs to be able to read .o, .a, and .so files, and produce ELF
executables and .so files. It should also support linker scripts.
This needs to "#include <elf.h>", which non-linux hosts won't always have
but which qemu has its own copy of already.
ar - library archiver
This is a wimpy archiver. It creates .a files from .o files
(and extracts .o files from .a files). It's a flat archive, with no
subdirectories.
Busybox has partial support for this (still read-only, last I checked).
The ranlib command indexes these archives.
SUSv4 has a standards document for this command:
http://www.opengroup.org/onlinepubs/9699919799/utilities/ar.html
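To show how flat the format is, here is a sketch that walks an archive
held in memory and counts its members. It relies on the common ar
layout (an 8 byte "!<arch>\n" magic, then a 60 byte text header before
each member, with the decimal size at offset 48 and members padded to
even offsets). The function name ar_count_members() is made up, and
special members such as a symbol index are counted like any other:

```c
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC    "!<arch>\n"
#define AR_HDR_SIZE 60

/* Count the members of an ar archive held in memory, or return -1 if
 * the data does not look like a well-formed archive. */
long ar_count_members(const unsigned char *data, size_t len)
{
    size_t pos = 8;
    long count = 0;

    if (len < 8 || memcmp(data, AR_MAGIC, 8))
        return -1;
    while (pos < len) {
        char sizebuf[11];
        unsigned long size;

        /* Each header ends with the two bytes "`\n". */
        if (len - pos < AR_HDR_SIZE || memcmp(data + pos + 58, "`\n", 2))
            return -1;
        memcpy(sizebuf, data + pos + 48, 10);   /* space padded decimal */
        sizebuf[10] = 0;
        size = strtoul(sizebuf, NULL, 10);
        pos += AR_HDR_SIZE;
        if (size > len - pos)
            return -1;
        pos += size + (size & 1);               /* keep offsets even */
        count++;
    }
    return count;
}
```

Extracting a member is then just copying "size" bytes from behind its
header, which is why a read-only implementation is the easy half.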
as - assembler
Tinycc has an x86 assembler. It should be genericized.
nm - name list
For some reason, gcc won't build without this.
SUSv4 has a standards document for this command:
http://www.opengroup.org/onlinepubs/9699919799/utilities/nm.html