Re: [Tinycc-devel] JIT compiler efficiency

Joshua Scholar Sun, 27 Dec 2020 02:02:24 -0800

I've only been playing with libtcc for a week, so I don't have all the
answers, but I am interested in a similar use.  You might be interested in
the questions I've asked and the answers I got.

My impressions so far:
1) tcc is a C compiler, and doesn't have any features added to make it
suitable for a JIT other than
a) the ability to compile code that's in memory to a buffer that's in memory
b) the ability to supply the addresses of external symbols to the compiler
c) the ability to retrieve the addresses of symbols that the compiler
generated.

Just the simplest things have been done.
There's no support for adding new code to a system (other than by making a
new, unrelated state and supplying addresses from the previous compilations
to it).

You have to make a new state each time you call the compiler.  You can't
delete the old states, or the code from them will be unusable.  You can't
reuse the symbol tables from them either.

Every time you compile, a few things from the runtime system are duplicated:
bits of the runtime system are linked again, taking more memory.  I'm told
that if you make no calls into the runtime system then some code won't be
linked, but that probably means that very basic things would be missing.

I've been told that at least you're using some run time from the enclosing
program - for instance the heap.  Thank God.

Obviously it has to load the headers and libraries that aren't part of
libtcc every compilation - luckily the compiler is super fast.

I've been told that the above flaws wouldn't be too hard to fix, but we'd
have to dig through the code and fix them ourselves.

I HAVE made a special version that keeps the include and library
directories embedded in the runtime so it doesn't have to read from the
disk to use those.  It works but it's not complete and it's not hosted
anywhere.  If I keep working on it, I'll fork the source on GitHub.

My current version has decisions that I don't like and I'm busy remedying
them.  The current version starts with a zip file of the directories you
want embedded, turns that into source code that's a byte array, then
compiles that in.
The runtime links in zlib and minizip to decompress that data when it's
included or linked. It all works, but:

1) it adds way too much complexity (zlib and minizip) to a project that's
supposed to be cross compatible

2) it slows down compilation by
a) having to decompress the data
b) requiring a critical section on the minizip/zlib code
c) allocating memory for each file as you use it and deallocating it when
done
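
As an illustration only (this is not the code from the version described
above), the "turn the data into source code that's a byte array" step could
look roughly like the sketch below.  The helper name format_c_array, the
12-bytes-per-line layout and the generated NAME_len variable are all
assumptions made for the example, and it formats raw bytes rather than
reading a zip file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format `len` bytes as a C array definition called
 * `name` into dst (capacity cap).  Returns the number of characters
 * written (not counting the terminator), or -1 if the input is empty or
 * dst is too small. */
static int format_c_array(char *dst, size_t cap, const char *name,
                          const unsigned char *data, size_t len)
{
    size_t used, i;
    int n;

    assert(dst != NULL && name != NULL && data != NULL);
    if (len == 0 || strlen(name) == 0)
        return -1;   /* an empty initializer list is not valid before C23 */

    n = snprintf(dst, cap, "const unsigned char %s[] = {", name);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;

    for (i = 0; i < len; i++) {
        /* start a new indented line every 12 bytes */
        n = snprintf(dst + used, cap - used, "%s0x%02x,",
                     (i % 12 == 0) ? "\n    " : " ", (unsigned)data[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(dst + used, cap - used,
                 "\n};\nconst unsigned long %s_len = %lu;\n",
                 name, (unsigned long)len);
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

The generated text is ordinary C, so it can be compiled into the runtime by
any compiler, tcc included.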

I'll soon be done with a new version without these problems.
1) it doesn't need zlib or minizip, instead it uses a simple program that I
wrote in C that can be part of the project, one that tcc can compile itself
too, of course.
2) the data is stored in memory uncompressed so that opening a file is
nothing more than finding a pointer to memory that's already there.  No
decompressing, no locks, no memory allocation and deallocation.
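
A minimal sketch of what "opening a file is nothing more than finding a
pointer" might look like, assuming the embedded files end up in a table that
is sorted by path.  The struct, the table, the function embedded_find and the
placeholder file contents are hypothetical.  The lookup here is a binary
search, chosen only to keep the example short; the post mentions testing a
hash function for this code, so the real lookup is presumably hash-based.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record describing one embedded file. */
struct embedded_file {
    const char *path;
    const unsigned char *data;
    size_t size;
};

/* Placeholder contents standing in for real headers. */
static const unsigned char stdarg_h[] = "typedef char *va_list;\n";
static const unsigned char stddef_h[] = "typedef unsigned long size_t;\n";

/* The table must stay sorted by path for the binary search below. */
static const struct embedded_file embedded_files[] = {
    { "include/stdarg.h", stdarg_h, sizeof stdarg_h - 1 },
    { "include/stddef.h", stddef_h, sizeof stddef_h - 1 },
};

/* Return the entry for `path`, or NULL if it is not embedded.
 * No allocation, no locking, no decompression: the data is already
 * sitting in read-only memory. */
static const struct embedded_file *embedded_find(const char *path)
{
    size_t lo = 0;
    size_t hi = sizeof embedded_files / sizeof embedded_files[0];

    assert(path != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(path, embedded_files[mid].path);
        if (c == 0)
            return &embedded_files[mid];
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}
```

Because the table is constant and never modified, concurrent compilations
can read it without a critical section.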

So I've already done some work to make libtcc more friendly - creating
internal assets instead of requiring a large directory structure on a
user's machine.

But all said, I'm beginning to think that tcc won't meet the requirements
of my project.
When I was testing a hash function for this new code, I noticed that Spooky
Hash compiled under tcc runs at 1/10th the speed it does when compiled under
gcc.  A 10 to 1 slowdown is a much bigger factor than I anticipated.
On the other hand, it may be that this was an unusually optimizable loop
and tcc does much better than that on average.
One sign of that is that if I use tcc to compile tcc, then use the version
it created itself to do the process again, the compile time is still fast.
I haven't measured it, but it doesn't seem much more than 2 times slower
than a tcc built by the Microsoft compiler.
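
For anyone who wants to repeat this kind of comparison, a rough timing
sketch follows.  It is an assumption-laden stand-in, not the test described
above: the workload is 32-bit FNV-1a rather than Spooky Hash, and the names
fnv1a and time_hash are made up for the example.  The idea is to build the
same file once with tcc and once with gcc and compare the reported times.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Stand-in workload: 32-bit FNV-1a (not Spooky Hash). */
static unsigned long fnv1a(const unsigned char *p, size_t n)
{
    unsigned long h = 2166136261UL;          /* FNV offset basis */
    while (n--) {
        h ^= *p++;
        h = (h * 16777619UL) & 0xffffffffUL; /* FNV prime, kept to 32 bits */
    }
    return h;
}

/* Hash buf `reps` times and return the CPU seconds used.  Each round
 * writes the previous hash into buf[0] so an optimizing compiler cannot
 * hoist the call out of the loop; the final hash is stored in *sink so
 * the work cannot be discarded. */
static double time_hash(unsigned char *buf, size_t n, int reps,
                        unsigned long *sink)
{
    clock_t start, end;
    unsigned long acc = 0;
    int i;

    assert(buf != NULL && sink != NULL && n > 0 && reps > 0);
    start = clock();
    for (i = 0; i < reps; i++) {
        buf[0] = (unsigned char)(acc & 0xffUL);
        acc = fnv1a(buf, n);
    }
    end = clock();
    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A single tight loop like this is exactly the case where an optimizing
compiler shines, so the ratio it shows should be read as a worst case, not
an average.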

For my own project, while I could probably add the features I said TCC was
missing above, I doubt my ability to add an optimization phase to TCC.   So
if I stick with TCC at all, it will only be because I'm having fun adding
things to the TCC project, not because it will get me something I can stick
with.  For that, I can't find anything that has enough features other than
LLVM - and I don't think JIT support in LLVM is mature.  It's being used,
but the component isn't stable and the new versions aren't compatible with
old ones, so the new one might not even be used anywhere yet.

Joshua Scholar

On Sun, Dec 27, 2020 at 1:23 AM fm663-subs--- via Tinycc-devel <
[email protected]> wrote:

> Hi TinyTCC developers
>
> Firstly I want to thank you guys for maintaining this amazing compiler
> that I have just discovered.
>
> My question is:
>
> When using libtcc as a JIT compiler, I would like to know if TCC already
> has some clever mechanism to reuse the results of - include libtcc.dll
> *.def and *.a.
>
> The reason I ask is because if I want to use TCC as JIT compiler for some
> dynamically generated C code repeatedly, I want to avoid unnecessary I/O
> and processing of the header files, static/dynamic libraries, etc.
> repeatedly.
>
> Can you please confirm whether there is some type of reuse/caching in the JIT
> compiler to handle this scenario? If yes, where in the code this is done?
> If not how to go about addressing this?
>
> Thank you
> Faisal
>
> _______________________________________________
> Tinycc-devel mailing list
> [email protected]
> https://lists.nongnu.org/mailman/listinfo/tinycc-devel
>
