Re: [Tinycc-devel] Minimizing libtcc memory use

2024-03-08 Thread Eric Raible
First off: If you are tired of this conversation, just tell me, I get it.


> > # text 32, data.rw  4, data.ro  4, bss 4 bytes
> > # memory usage: 8192 to run, 649 symbols, 2901 other, 1639290 max (bytes)
> > mem_cur_size=11742 (bytes)
>
> > So tcc_print_stats() says 11742, but then displays values totaling 11786.
>
> What 11786 ?
>
> 8192 + 649 + 2901 = 11742
>
I misinterpreted the output and didn't realize that the 44 bytes
from text, data.rw, data.ro, and bss were already included.


> There is nothing wrong here:  11742 + 32 blocks * 96 = 14814
> Where 96 is the size of tcc's mem_debug extra header.
>
> If you want to see the 11742 from valgrind then you just need to
> run the same example with a normal tcc compiled without MEM_DEBUG.
>
> Which makes sense I would think.
>
100%


> But when showing the example with MEM_DEBUG and -bench -vv  I
> did not expect you to doubt the numbers in the first place.
>
I shouldn't have.  But once my allocator and valgrind agreed,
I went down the wrong path, especially considering I was parsing
the output incorrectly.


> Rather I just was trying to show how you could get some numbers
> for your own real case instead.  Which as you suspect could be
> minimized from 29kB down to 1-2 kB.  Most likely impossible but
> if we had some numbers we could tell also why.
>

And showing me was helpful; I use it below to get some numbers.

I don't know enough about the internals of tcc to really even be having
this conversation.  But I do know that my interactive application can
have many TCCStates.  With tcc configured as:

./configure --extra-cflags="-DMEM_DEBUG"

and with tcc_set_options(state, "-Werror -vv -bench"),
a simple hello world (single-TCCState) example reports:
-
0: .text    0x14b57000  len 001bc  align 1000
1: .data.ro 0x14b571c0  len 00030  align 0008
2: .data    0x14b571f0  len 00078  align 0008
2: .bss     0x14b57268  len 00050  align 0008
2: .got     0x14b572b8  len 00030  align 0008
-
protect rwx 0x14b57000  len 01000
-

That looks great!

But tcc has actually allocated 21248 bytes, according to both my
(more sophisticated) custom allocator and valgrind
(for instance if I _don't_ delete the state).

So ~80% of the bytes are unaccounted for.
That overhead scales exactly with the number of states, so it's not ideal.

My loaded C has a callback to register all of its functions
with the main application.  After that, I have no need for
tcc_get_symbol() support, or in fact anything from libtcc
except for tcc_delete().

So I'm trying to pre-delete() all of the unneeded stuff that
tcc_delete() will eventually free anyway.  I was calling that
tcc_finalize().  The goal is to reduce the minimal TCCState
size from ~21k to the 4K required for PROT_EXEC.

- Eric
___
Tinycc-devel mailing list
Tinycc-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/tinycc-devel


Re: [Tinycc-devel] Minimizing libtcc memory use

2024-03-08 Thread grischka via Tinycc-devel

On 08.03.2024 07:30, Eric Raible wrote:

> I guess that I just want the numbers to add up.
> Using your example:
>
> 1) -DMEM_DEBUG -DCONFIG_RUNMEM_RO=0
> 2) your test.c
> 3) but I added an early return to tcc_delete() to no-op it
>
> Running: valgrind tcc -nostdlib -vv -bench -run test.c
> produced:
>
> tcc version 0.9.28rc 2024-03-03 mob@9d2068c6* (AArch64 Linux)
> -> test.c
> -
> 0: .text    0x4ccb000  len 00020  align 1000
> 1: .data.ro 0x4ccb020  len 4  align 0008
> 2: .data    0x4ccb028  len 4  align 0008
> 2: .bss     0x4ccb030  len 4  align 0008
> -
> protect rwx 0x4ccb000  len 01000
> -
> # 3030 idents, 4 lines, 92 bytes
> # 0.463 s, 8 lines/s, 0.0 MB/s
> # text 32, data.rw  4, data.ro  4, bss 4 bytes
> # memory usage: 8192 to run, 649 symbols, 2901 other, 1639290 max (bytes)
> mem_cur_size=11742 (bytes)
>
> So tcc_print_stats() says 11742, but then displays values totaling 11786.


What 11786 ?

8192 + 649 + 2901 = 11742

> ==2188==
> ==2188== HEAP SUMMARY:
> ==2188== in use at exit: 14,814 bytes in 32 blocks
>
> And valgrind reports 14814.  I have never seen valgrind wrong about this,
> and especially so because my (luckily-correct) allocator reported 14814 as well.

There is nothing wrong here:  11742 + 32 blocks * 96 = 14814
Where 96 is the size of tcc's mem_debug extra header.

If you want to see the 11742 from valgrind then you just need to
run the same example with a normal tcc compiled without MEM_DEBUG.

Which makes sense I would think.

But when showing the example with MEM_DEBUG and -bench -vv  I
did not expect you to doubt the numbers in the first place.

Rather I just was trying to show how you could get some numbers
for your own real case instead.  Which as you suspect could be
minimized from 29kB down to 1-2 kB.  Most likely impossible but
if we had some numbers we could tell also why.

-- gr

