Re: [Tinycc-devel] Allow configuration of tcc libraries search path

Rob Landley Fri, 30 Sep 2011 11:49:10 -0700

On 09/30/2011 10:25 AM, Kirill Smelkov wrote:
>>> I have it as git branch btw. If anyone is interested I could push
>>> it on repo.or.cz.
>>
>> I thought your current maintainer policy was zero editorial control,
>> just let random strangers form a slush pile in the mob branch and then
>> ship it?  (I can't say I've been reading the list very closely.)
>>
>> It's a little disheartening to see issues I fixed over three years ago
>> come up over and over again.  If I recall, you refused to port things
>> you either "didn't understand" or didn't see a need for (such as
>> refactoring the code so it wasn't one big tcc.c without even a tcc.h)
>> from my tree the last time I gave up and waited for you guys to just
>> stop it.
>>
>> I put a lot of work into my version, but Fabrice wouldn't hand the
>> project to anybody who wanted to move the code out of the gnu.org CVS
>> repository, and Linux Weekly News covered releases that went up on
>> tinycc.org, even back when the bulk of the changes in them ported from
>> my version.  I wouldn't mind so much if you didn't REFUSE TO TAKE
>> OBVIOUS THINGS that you're now finding a need for all these years later.
>>
>> Sigh.  I'm going to go back to ignoring this list now.  I should go
>> catch up on the pcc and llvm lists...
>>
>> Rob
> 
> 
> Rob, thanks for the links and for sharing. Are you in principle ok to
> relicense your changes back from GPL to LGPL so that they could be
> included one way or another into tcc?


Years ago, in my repo, I significantly refactored the code to do things
like put declarations in headers, put architecture-specific code in
subdirectories with consistent names, break out the command line option
parsing into a separate "options.c", and so on.  There were functions
named g() and o() which were IMPOSSIBLE to grep for (leftovers from its
heritage in the obfuscated c code contest, I guess), which I laboriously
replaced with sane names.  I started grouping random global variables
into structures, I was working on splitting the preprocessor, compiler,
and linker code into three files... Half of what I did was code CLEANUP.

Grischka had no interest in any of that.  Instead he cherry picked a few
commits he understood:

  http://lists.gnu.org/archive/html/tinycc-devel/2007-12/msg00040.html

And then told me that I had to abandon my cleanup work and start over on
his tree, and explain everything to his satisfaction (as a Windows guy)
before it could go in.  I'm still kind of pissed about it:

  http://landley.net/code/tinycc

I.E. his current policy of "go ahead and commit anything you like, I
don't care" isn't what he used to do.  It's an overreaction in the other
direction.

For the patches in question, I'm pretty sure he _specifically_ told me
that he _didn't_ want colon separated search paths.  (Even though I
broke down and made it configurable so it could be semicolon instead for
his windows sensibilities.)

My initial complaint was CVS made it hard for me to work, true:

  http://lists.gnu.org/archive/html/tinycc-devel/2008-09/msg00013.html

But I changed the license because the zombie CVS tree was following me,
and I didn't want to encourage it:

  http://lists.gnu.org/archive/html/tinycc-devel/2008-09/msg00027.html

But these days, my complaint is that I have no confidence whatsoever in
tinycc's maintainership.  It has the tinycc.org domain, and Fabrice
handed over the project, so it is the official final resting place of
tcc.  But it's still stagnant, because Fabrice put a Windows developer
in charge of the project, one who apparently does not understand open
source development in the slightest. He's putting out a windows-only
version of tcc as far as I can tell, one which will never build an
unmodified Linux kernel (has made zero progress on this front in the
past _THREE_YEARS_), thus it cannot ever act as (even an inferior) gcc
replacement.

Thus I have no interest in it.  I check back every few months to see if
it's _died_ yet, but it's really mostly curiosity at this point.

Someday if I get back to this topic, I want to glue either sparse or the
tcc front end to qemu's tcg back end and produce a new compiler that A)
supports all the targets qemu does, B) can build linux and busybox and
uClibc and itself (thus providing a self-bootstrapping system; I'd
upgrade busybox to have missing bits like "make").

See the attached files for some todo items on that front.  But according
to the dates on those files, I last touched them in May 2009, so this
isn't exactly a priority for me.  Tinycc isn't dead yet, and there's
plenty of _other_ interesting stuff like the LLVM backend for Sparse:

  http://lwn.net/Articles/456709/

> For us, tinycc users, it's a pity to see this issue being stagnated over
> and over again. The CVS is de-facto gone - if it was it, now there is no
> point to block your changes being merged!

Tinycc has been stagnant since Fabrice left in 2005.  It saw surges of
effort from me, from David Wheeler's trusting trust paper, the people
who did the arm and x86-64 ports, and grischka's work on windows stuff
(and on getting gcc to build under it)... but it never added _up_ to
anything.  I started by collecting together various out of tree patches,
and then poured hundreds of hours of my own effort into it, but nobody
wanted to build on my work because it wasn't "official".

Instead a windows developer turned it into a windows-only project, and
he has such a poor understanding of open source development that he
doesn't even do code review or control access to the repository.

Do you have any idea how WRONG that is?  Alan Cox once told me "A
maintainer's job is to say no".  And he's right: they're like the editor
of a magazine, going through the slush pile to find the best bits,
polish them up, stitch them into a coherent next issue, publish, and
repeat.  This is not a new task, this is basic editorial judgement that
people have been doing since Gutenberg invented the printing press.

Publishing the raw slush pile is NOT INTERESTING.  Fighting off
Sturgeon's Law is what editors _do_.  It doesn't matter if we're talking
about the four levels of Linux developers (developer, maintainer,
lieutenant, architect) bouncing back patches with comments, or
Canonical leaving 98% of sourceforge OFF their install CD, or sites
like Slashdot and Fark publishing a tiny number of headlines from the
thousands they get each day, or Google giving you a page of ten most
interesting hits from an internet approaching a trillion pages of spam,
porn, and emo blogs with cat pictures.

Grischka has not set direction for the project, let alone goals.
Where's the list of problems that need to be solved?  Where's the "and
now we build package X" announcements?  Where's the RELEASE SCHEDULE?

  http://video.google.com/videoplay?docid=-5503858974016723264

This project is still unmaintained, it's just unmaintained by Grischka.
 The last release on tinycc.org was TWO AND A HALF YEARS AGO.  Even
uClibc never got quite that bad.

> Sometimes people need time to de-cvs'ify themselves and adopt good
> distributed vcs practices.  I agree about somewhat funny current mob
> rules, but it would be a very pity again if that would be a blocker for
> cooperation.

I do not trust Grischka's technical judgement, his commitment to the
project (I put way more time into my fork than he's put into the
official version), his leadership abilities, or his organizational skills.

I spent three years of my life improving this project (not just coding
but documentation and testing and design and so on), and the result was
essentially discarded at the whim of somebody who DIDN'T UNDERSTAND WHAT
I WAS DOING.  The fact the rest of you are now finally realizing that
some of the problems I already solved years ago were, in fact, real
issues, is mildly amusing to me in a morbid way.

If you have competent developers on this list you don't NEED my patches,
you can figure out how to do it from the _idea_ in a couple hours.  If
you need more details, I discussed the general idea somewhat at length
at the OLS compiler BOF in 2008:

  http://free-electrons.com/pub/video/2008/ols/ols2008-rob-landley-linux-compiler.ogg

I doubt my patches will remotely apply to the git codebase anyway, I
changed a _lot_.

Rob

Seven packages.  This is to replace binutils and gcc.

FWL needs: ar as nm cc gcc make ld
  - Why gcc (shouldn't cc cover it?  What builds?)
  - Need a make.  Separate issue, busybox probably.

Loot tinycc fork to provide:

  cc - front-end option parsing
    multiplexer (swiss-army-executable ala busybox)
      cross-prefix, so check last few chars: cc,ld,ar,as,nm

    Calls several automatically (assembler, compiler, linker) as necessary.
      Pass on linker options via -Wl,

    Merge in FWL wrapper stuff (ccwrap.c)
      call out again?  distcc support?

    Path logic:
      compiler includes: ../qcc/include
      system includes: ../include
      compiler libraries: ../qcc/lib
      system libraries: ../lib
      tools: built-in (or shell out with same prefix via $PATH)
      command line stuff: current directory

  ld - linker
    #include <elf.h> which qemu already has.
    Support for .o, .a, .so -> exe, .so
    Support for linker scripts

  ar - library archiver
    Busybox has partial support (still read-only?)
    ranlib?

  cc1 - compiler
    preprocessor (-E) support
    output (.c->.o) support

  as - assembler

  nm - needed to build something?

binutils provides:
  ar as nm ld - already covered
  strip, ranlib, addr2line, size, objdump, objcopy - low hanging fruit
  readelf - uClibc has one
  strings - busybox provides one 

  Probably not worth it:
    gprof - profiling support (optional)
    c++filt - C++ and Java, not C.
    windmc, dlltool - Windows only (why is it installed on Linux?)
    nlmconv - Novell Netware only (why is this installed on Linux?)

QCC - QEMU C Compiler.

  Use QEMU's Tiny Code Generator as a backend for a compiler based on my old
  fork of Fabrice Bellard's tinycc project.

Why?

  QEMU's TCG provides support for many different targets (x86, x86-64, arm,
  mips, ppc, sh4, sparc, alpha, m68k, cris).  It has an active development
  community upgrading and optimizing it.

  QEMU application emulation also provides existing support for various ELF
  executable and library formats, so linking logic can presumably be merged.
  (See elf.h at the top of qemu.)  QEMU is also likely to grow coff and pxe
  support in future.

Building a self-bootstrapping system:

  My Firmware Linux project builds the smallest self-bootstrapping system
  I could come up with using the following existing packages:

    gcc, binutils, make, bash, busybox, uClibc, linux

  This new compiler should replace both binutils and gcc above.  (As a smoke
  test, the new system should still be able to build all seven packages.)

  To build those packages, FWL needs the following commands from the host
  toolchain.  (It can build everything else from source, but building these
  without already having them is a chicken and egg problem.)

    ar as nm cc gcc make ld /bin/bash

  The reason it needs "gcc" is that the linux and uClibc packages assume
  their host compiler is named "gcc", and call that name instead of cc even
  when it's not there.  (You can mostly override this by specifying HOSTCC=$CC
  on the make command line, although a few places need actual source patches.)

  Ignoring gcc, make, and bash, this leaves "ar, as, nm, cc, and ld" as
  commands qcc needs to provide for a minimal self-bootstrapping system.

  Note that the above set of tools is specifically enough to build a fresh
  compiler.  When building a linux kernel, creating a bzImage requires objcopy,
  building qemu requires strip, etc.

What commands does the current gcc/binutils combo provide?

  gcc 4.1 provides the commands:
    cc/gcc - C compiler
    cpp - C preprocessor (equivalent to cc -E)
    gcov - coverage tester (optional debugging tool)

    Of these, cc is required, cpp is low hanging fruit, and gcov is probably
    unnecessary.

  Binutils provides:
    ar - archiver, creates .a files.
    ranlib - generate index to .a archive (equivalent to ar -s)
    as - assembler
    ld - linker
    strip - discard symbols from object files (equivalent to ld -S)
    nm - list symbols from ELF files.
    size - show ELF section sizes
    objdump - show contents of ELF files
    objcopy - copy/translate ELF files
    readelf - show contents of ELF files
    addr2line - convert addresses to filename/line number (optional debug tool)
    strings - show printable characters from binary file
    gprof - profiling support (optional)
    c++filt - C++ and Java, not C.
    windmc, dlltool - Windows only (why is it installed on Linux?)
    nlmconv - Novell Netware only (why is this installed on Linux?)

    Of these, ar, as, ld, and nm are needed; ranlib, strip, addr2line, and
    size are low hanging fruit; size, objdump, objcopy, and readelf are
    variants of the same logic as nm; and gprof, c++filt, windmc, dlltool,
    and nlmconv are probably unnecessary.

Standards:

  The following utilities have SUSv4 pages describing their operation, at
  http://www.opengroup.org/onlinepubs/9699919799/utilities

    ar, c99, nm, strings

  This means the following don't:

    ld, cpp, as, ranlib, strip, size, readelf, objdump, objcopy, addr2line

  (There isn't a "cc" standard, but you can probably use "c99" for that.)

Existing code:

  multiplexer:

    The compiler must be available under several different names, yet the same
    functionality must be callable from a single compiler executable,
    assembling when it encounters embedded assembler, passing on linker
    options via "-Wl," to the linking stage, and so on.

    The easy way to do this is for the qcc executable to be a swiss-army-knife
    executable, like busybox.  It needs a command multiplexer which can figure
    out which name it was called under and change behavior appropriately, to
    act as a compiler, assembler, linker, and so on.

    This multiplexer should accept arbitrary prefixes, so cross compiler names
    such as "i686-cc" work.  This means instead of matching entire known names,
    the multiplexer should check that commands _end_ with recognized strings.
    (This would not only allow it to be called as both "qcc" and "cc", but
    would have the added bonus of making "gcc" work like "cc" as well.)

    Both busybox and tinycc already handle this.  Pretty straightforward.
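
    As a rough illustration of the suffix matching described above, here is
    a minimal sketch.  The enum, the table, and the name qcc_select_tool()
    are made up for this example; they are not taken from busybox or tinycc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tool identifiers; the names are illustrative only. */
enum qcc_tool { TOOL_UNKNOWN, TOOL_CC, TOOL_LD, TOOL_AR, TOOL_AS, TOOL_NM };

/* Return nonzero if `name` ends with `suffix`. */
static int ends_with(const char *name, const char *suffix)
{
    size_t nlen = strlen(name), slen = strlen(suffix);

    return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/* Pick a tool from argv[0] by matching the END of its basename, so that
 * prefixed names such as "i686-cc" (and also "gcc" or "qcc") select cc. */
enum qcc_tool qcc_select_tool(const char *argv0)
{
    static const struct {
        const char *suffix;
        enum qcc_tool tool;
    } table[] = {
        { "cc", TOOL_CC }, { "ld", TOOL_LD }, { "ar", TOOL_AR },
        { "as", TOOL_AS }, { "nm", TOOL_NM },
    };
    const char *base = strrchr(argv0, '/');
    size_t i;

    /* Ignore any directory part of the name we were called under. */
    base = base ? base + 1 : argv0;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (ends_with(base, table[i].suffix))
            return table[i].tool;
    return TOOL_UNKNOWN;
}
```

    A main() would call qcc_select_tool(argv[0]) and dispatch on the result.
    Pure suffix matching is deliberately loose (anything ending in "ld"
    selects the linker), so a real version might want a stricter rule.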

  cc/c99 - front-end option parsing

    Both tinycc's options.c and ccwrap.c (in FWL) handle command line option
    parsing, in different ways.  Both take as input the same command line
    syntax as gcc, which is more or less the c99 command line syntax from
    SUSv4:

      http://www.opengroup.org/onlinepubs/9699919799/utilities/c99.html

    What ccwrap.c does is rewrite a gcc command line to turn "cc hello.c"
    into a big long command line with -L and -I entries, explicitly specifying
    header and library paths, the need to link against standard libraries
    such as libc, and to link against crt1.o and such as appropriate.

    Such a front end option parser could perform such command line rewriting
    and then call a "cc1" that contains no built-in knowledge about standard
    paths or libraries.  This would neatly centralize such behavior, and
    if the rewritten command line could actually be extracted it could be
    tested against other compilers (such as gcc) to help debugging.

    Note that adding distcc or ccache support to such a wrapper is a fairly
    straightforward item for future expansion.

    The option parser needs to distinguish "compiling" from "linking".

      When compiling, the option parser needs to specify two include paths;
      one for the compiler (varargs.h, defaulting to ../qcc/include) and
      one for the system (stdio.h, defaulting to ../include).

      When linking, the option parser needs to specify the compiler library
      path (where libqcc.a lives, defaulting to ../qcc/lib), the system
      library path (where libc.a lives, defaulting to ../lib), and add
      explicit calls to link in the standard libraries and the startup/exit
      code.  Currently, ccwrap.c does all this.
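
    A rough sketch of that kind of rewriting follows.  This is not the
    actual ccwrap.c logic: the struct, the function names, the fixed-size
    buffers, and the exact argument order are assumptions made for the
    example.  The flags used (-c, -S, -E, -I, -L, -l, -nostdinc, -nostdlib)
    are the usual gcc ones.

```c
#include <stdio.h>
#include <string.h>

#define QCC_MAX_ARGS 64

/* Rewritten command line; generated strings live in `store`. */
struct qcc_cmdline {
    const char *argv[QCC_MAX_ARGS + 1];   /* NULL-terminated */
    int argc;
    char store[5][256];                   /* backing storage for new args */
    int used;
};

static int push(struct qcc_cmdline *c, const char *arg)
{
    if (c->argc >= QCC_MAX_ARGS)
        return -1;
    c->argv[c->argc++] = arg;
    c->argv[c->argc] = NULL;
    return 0;
}

/* Append prefix+path+suffix (e.g. "-I" + dir) as one generated argument. */
static int push_joined(struct qcc_cmdline *c, const char *prefix,
                       const char *path, const char *suffix)
{
    char *slot;
    int n;

    if (c->used >= 5)
        return -1;
    slot = c->store[c->used++];
    n = snprintf(slot, sizeof(c->store[0]), "%s%s%s", prefix, path, suffix);
    if (n < 0 || (size_t)n >= sizeof(c->store[0]))
        return -1;
    return push(c, slot);
}

/* Nonzero if -c, -S or -E is present, i.e. no link step is requested. */
int qcc_is_compile_only(int argc, char *const argv[])
{
    int i;

    for (i = 1; i < argc; i++)
        if (!strcmp(argv[i], "-c") || !strcmp(argv[i], "-S") ||
            !strcmp(argv[i], "-E"))
            return 1;
    return 0;
}

/* Expand e.g. "cc hello.c" into a command line with explicit paths.
 * Returns 0 on success, -1 if a buffer limit is hit. */
int qcc_rewrite(struct qcc_cmdline *c, int argc, char *const argv[],
                const char *cc_inc, const char *sys_inc,
                const char *cc_lib, const char *sys_lib)
{
    int i, link = !qcc_is_compile_only(argc, argv);

    c->argc = 0;
    c->used = 0;

    /* Include paths are always needed: compiler headers, then system. */
    if (push(c, argc > 0 ? argv[0] : "cc") || push(c, "-nostdinc") ||
        push_joined(c, "-I", cc_inc, "") || push_joined(c, "-I", sys_inc, ""))
        return -1;

    /* Library paths and startup code only matter when linking. */
    if (link && (push(c, "-nostdlib") ||
                 push_joined(c, "-L", cc_lib, "") ||
                 push_joined(c, "-L", sys_lib, "") ||
                 push_joined(c, "", sys_lib, "/crt1.o")))
        return -1;

    /* Pass the user's own arguments through unchanged. */
    for (i = 1; i < argc; i++)
        if (push(c, argv[i]))
            return -1;

    /* Standard libraries go last so they resolve symbols from the above. */
    if (link && (push(c, "-lqcc") || push(c, "-lc")))
        return -1;
    return 0;
}
```

    The resulting argv could be printed for debugging or handed to
    execvp(), which is what makes the rewritten line testable against
    another compiler.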

    Note that these default paths aren't relative to the current directory
    (which can't change or files listed on the command line wouldn't be found),
    but relative to the directory where the qcc executable lives.  This allows
    the compiler to be relocatable, and thus extracted into a user's home
    directory and called from there.  (The user's home directory name cannot
    be known at compile time.)  The defaults can also be specified as absolute
    paths when the compiler is configured.
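
    The relocation rule can be sketched as below.  The function name
    qcc_resolve_path() is hypothetical, and the caller is assumed to
    already know the full path of the running executable (on Linux, for
    instance, readlink() on /proc/self/exe is one way to get it).

```c
#include <stdio.h>
#include <string.h>

/* Resolve a configured default path.  Absolute defaults are used as-is;
 * relative ones are taken relative to the directory holding the
 * executable, NOT the current directory.  Returns 0 on success, -1 if
 * `out` is too small.  `exe_path` is assumed to be the full path of the
 * running binary. */
int qcc_resolve_path(const char *exe_path, const char *dflt,
                     char *out, size_t outlen)
{
    const char *slash = strrchr(exe_path, '/');
    int dirlen = slash ? (int)(slash - exe_path) : 0;
    int n;

    if (dflt[0] == '/')
        n = snprintf(out, outlen, "%s", dflt);
    else if (slash)
        n = snprintf(out, outlen, "%.*s/%s", dirlen, exe_path, dflt);
    else
        n = snprintf(out, outlen, "./%s", dflt);  /* no directory known */

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

    With the executable at /home/user/tools/bin/qcc, the default
    "../qcc/include" resolves to /home/user/tools/bin/../qcc/include, so
    the whole tree can be moved without rebuilding.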

    The current ccwrap.c also modifies the $PATH (so gcc's front-end can
    shell out to tools such as its own "cc1" and "ld"), and supports C++.
    Although qcc doesn't need either of these, both are useful for shelling
    out to another compiler (such as gcc).

    The wrapper can split "compiling and linking" lines into two commands,
    either saving intermediate results in the /tmp directory or forking and
    using pipes.  (That way cc1 doesn't need to know anything about linking.)
    Optionally, the compiler can initialize the same structures used by the
    linker, but is the speed/complexity tradeoff here worth it?

    Note that "-run" support is actually a property of the linker.

  cpp - preprocessor

    This performs macro substitution, like "qcc -E".

  cc1 - compiler

    This compiles C source code.  Specifically, it converts one or more .c
    files into a single .o file, for a specific target.

    Generating assembly output is best done by running the binary tcg output
    through a disassembler.  Keep it orthogonal.

  ld - linker
    This needs to be able to read .o, .a, and .so files, and produce ELF
    executables and .so files.  It should also support linker scripts.

    This needs to "#include <elf.h>", which non-linux hosts won't always have
    but which qemu already has its own copy of.
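
    As a small illustration of telling the linker's input types apart,
    here is a sketch that classifies a file from its first bytes.  To stay
    buildable on hosts without <elf.h> it hard-codes the handful of ELF
    constants it needs instead of including the header; the enum and the
    name ld_classify() are invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of linker inputs. */
enum ld_input {
    LD_IN_UNKNOWN, LD_IN_OBJECT, LD_IN_SHARED, LD_IN_EXEC, LD_IN_ARCHIVE
};

/* Classify a file from the start of its contents.  Values below mirror
 * what <elf.h> calls EI_DATA, ELFDATA2LSB/MSB, ET_REL, ET_EXEC, ET_DYN. */
enum ld_input ld_classify(const unsigned char *buf, size_t len)
{
    unsigned type;

    /* .a archives start with a fixed 8-byte magic string. */
    if (len >= 8 && memcmp(buf, "!<arch>\n", 8) == 0)
        return LD_IN_ARCHIVE;

    /* ELF files start with 0x7f 'E' 'L' 'F'; e_type follows the 16-byte
     * e_ident array, so at least 18 bytes are needed. */
    if (len < 18 || memcmp(buf, "\177ELF", 4) != 0)
        return LD_IN_UNKNOWN;

    /* Byte 5 (EI_DATA) gives the byte order of the 16-bit e_type field. */
    if (buf[5] == 1)                       /* little endian */
        type = buf[16] | ((unsigned)buf[17] << 8);
    else if (buf[5] == 2)                  /* big endian */
        type = ((unsigned)buf[16] << 8) | buf[17];
    else
        return LD_IN_UNKNOWN;

    switch (type) {
    case 1:  return LD_IN_OBJECT;          /* ET_REL: relocatable .o */
    case 2:  return LD_IN_EXEC;            /* ET_EXEC: executable */
    case 3:  return LD_IN_SHARED;          /* ET_DYN: .so (also PIE) */
    default: return LD_IN_UNKNOWN;
    }
}
```

    Note that ET_DYN also covers position independent executables, so a
    real linker would need more than this to tell a .so from a PIE binary.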

  ar - library archiver
    This is a wimpy archiver.  It creates .a files from .o files
    (and extracts .o files from .a files).  It's a flat archive, with no
    subdirectories.

    Busybox has partial support for this (still read-only, last I checked).

    The ranlib command indexes these archives.

    SUSv4 has a standards document for this command:

      http://www.opengroup.org/onlinepubs/9699919799/utilities/ar.html
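
    To show how flat the format is, here is a minimal sketch that walks
    the members of an archive held in memory, assuming the common Unix
    layout (8-byte "!<arch>\n" magic, then 60-byte ASCII headers, each
    followed by the member data padded to an even length).  The name
    ar_walk() and the callback shape are invented for the example, and
    GNU/BSD long-name extensions are not handled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8

/* On-disk member header: 60 bytes of space-padded ASCII fields. */
struct ar_hdr_raw {
    char name[16];
    char date[12];
    char uid[6];
    char gid[6];
    char mode[8];
    char size[10];
    char fmag[2];    /* always "`\n" */
};

/* Walk an in-memory archive, calling `cb` (if non-NULL) per member.
 * Returns the member count, or -1 if the archive is malformed. */
int ar_walk(const unsigned char *buf, size_t len,
            void (*cb)(const char *name, size_t size, void *ctx), void *ctx)
{
    size_t off = AR_MAGIC_LEN;
    int count = 0;

    if (len < AR_MAGIC_LEN || memcmp(buf, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;

    while (off < len) {
        struct ar_hdr_raw h;
        char name[17], sizestr[11];
        unsigned long size;
        int i;

        if (len - off < sizeof(h))
            return -1;
        memcpy(&h, buf + off, sizeof(h));
        if (h.fmag[0] != '`' || h.fmag[1] != '\n')
            return -1;

        /* The size field is decimal ASCII, not NUL-terminated. */
        memcpy(sizestr, h.size, 10);
        sizestr[10] = '\0';
        size = strtoul(sizestr, NULL, 10);

        /* Strip space padding and the trailing '/' some ar versions add.
         * Special members like "/" (symbol index) end up with an empty
         * name here. */
        memcpy(name, h.name, 16);
        name[16] = '\0';
        for (i = 15; i >= 0 && (name[i] == ' ' || name[i] == '/'); i--)
            name[i] = '\0';

        off += sizeof(h);
        if (size > len - off)
            return -1;
        if (cb)
            cb(name, size, ctx);
        count++;

        /* Member data is padded to an even offset. */
        off += size + (size & 1);
    }
    return count;
}
```

    Listing an archive ("ar t") is then just a callback that prints each
    name; creating one is the same header written in the other direction.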

  as - assembler
    Tinycc has an x86 assembler.  It should be genericized.

  nm - name list

    For some reason, gcc won't build without this.

    SUSv4 has a standards document for this command:

      http://www.opengroup.org/onlinepubs/9699919799/utilities/nm.html

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
