Re: The size of ‘.go’ files

2020-06-24 Thread Andy Wingo
Hi :)

On Tue 09 Jun 2020 18:09, Ludovic Courtès writes:

> Andy Wingo wrote:
>
>> The guile.arities section starts with a sorted array of fixed-size
>> headers, then is followed by a sequence of ULEB128 references to local
>> variable names, including non-arguments.  The size is a bit perplexing,
>> I agree.  I can think of a number of ways to encode that section
>> differently but we'd need to understand a bit more about it and why the
>> baseline compiler is significantly different.
>
> ‘.guile.arities’ size should be proportional to the number of
> procedures, right?  Additionally, if there are only/mostly thunks, the
> string table for argument names should be small if not empty.  For N
> thunks, I would expect roughly N 28-byte headers + N×M ULEB128 values,
> say 100 bytes per thunk; there are 1000 of them, so we should be at
> ~100,000 bytes.  This is roughly what we observe with the baseline
> compiler.

Yes, but that doesn't mean that you can directly compare baseline to CPS
-- CPS has many more intermediate names than baseline for non-argument
locals, all of which end up getting entries in the arities section.

Andy



Re: The size of ‘.go’ files

2020-06-09 Thread Ludovic Courtès
Hello!

Andy Wingo wrote:

> A few points of information :)

Much appreciated!

> The guile.arities section starts with a sorted array of fixed-size
> headers, then is followed by a sequence of ULEB128 references to local
> variable names, including non-arguments.  The size is a bit perplexing,
> I agree.  I can think of a number of ways to encode that section
> differently but we'd need to understand a bit more about it and why the
> baseline compiler is significantly different.

‘.guile.arities’ size should be proportional to the number of
procedures, right?  Additionally, if there are only/mostly thunks, the
string table for argument names should be small if not empty.  For N
thunks, I would expect roughly N 28-byte headers + N×M ULEB128 values,
say 100 bytes per thunk; there are 1000 of them, so we should be at
~100,000 bytes.  This is roughly what we observe with the baseline
compiler.
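
As a quick sanity check of that estimate, here is a small Python sketch.
The 100 bytes per thunk and the count of 1000 thunks come from the
paragraph above; the two observed sizes are the ‘.guile.arities’ figures
from the section listings quoted elsewhere in this thread (1358536 bytes
for the first build, 193106 bytes for the -O1 build).  The function and
variable names are made up for illustration.

```python
# Back-of-the-envelope check of the '.guile.arities' size estimate.
# EST_BYTES_PER_THUNK and N_THUNKS are the rough figures from the message;
# they are estimates, not measurements.
EST_BYTES_PER_THUNK = 100
N_THUNKS = 1000

# Observed '.guile.arities' sizes from the two listings in this thread.
OBSERVED = {
    "first listing": 1358536,
    "-O1 listing": 193106,
}


def estimated_arities_bytes(n_thunks, bytes_per_thunk=EST_BYTES_PER_THUNK):
    """Expected section size if every thunk costs a fixed number of bytes."""
    return n_thunks * bytes_per_thunk


def bytes_per_procedure(section_bytes, n_procedures):
    """Average number of section bytes spent on each procedure."""
    return section_bytes / n_procedures


if __name__ == "__main__":
    estimate = estimated_arities_bytes(N_THUNKS)
    print("estimate:", estimate, "bytes")
    for label, size in OBSERVED.items():
        per_proc = bytes_per_procedure(size, N_THUNKS)
        print(f"{label}: {per_proc:.1f} bytes/procedure, "
              f"{size / estimate:.1f}x the estimate")
```

The per-procedure averages only mean something if the 1000-thunk count
is close to the real number of procedures in the file, which this
thread does not confirm.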

>> “.rtl-text” is 38% smaller and “.guile.arities” is almost a tenth of
>> what it was.
>
> The difference in the text is due to the new baseline intrinsics,
> e.g. $vector-ref.  It goes in the opposite direction from instruction
> explosion, which sought to (1) make the JIT compiler easier by
> decomposing compound operations into their atomic parts, (2) make the
> optimizer learn more information from flow rather than type-checking
> side effects, and (3) allow the optimizer to eliminate / hoist / move
> the component pieces of macro-operations.
>
> However in the baseline compiler (2) and (3) aren't possible because
> there is no optimizer on that level, and therefore the result is
> actually a loss -- 10 micro-ops cost more than 1 macro-op because of
> stack traffic overhead, which isn't currently mitigated by the JIT (1).
>
> So instruction explosion is residual code explosion, which should pay
> off in theory, but not for the baseline compiler.  So I added new
> intrinsics for e.g. $vector-ref et al.  Thus the smaller code size.

Yes, that makes a lot of sense.  In particular, this file must use the
struct intrinsics a lot.

> There are more possibilities for making code size smaller, e.g. having
> two equivalent encodings for bytecode, where one is smaller:
>
>   https://webkit.org/blog/9329/a-new-bytecode-format-for-javascriptcore/

Like THUMB, but for bytecode.  :-)

I guess we could first analyze the generated code more closely and see
if there are opportunities there.

Thanks for the explanations!

Ludo’.



Re: The size of ‘.go’ files

2020-06-08 Thread Andy Wingo
Hi :)

A few points of information :)

On Fri 05 Jun 2020 22:50, Ludovic Courtès writes:

> [Sorting] the ELF sections of a .go file by size; for ‘python-xyz.go’,
> I get this:
>
> $13 = ((".rtl-text" . 3417108)
>  (".guile.arities" . 1358536)
>  (".data" . 586912)
>  (".rodata" . 361599)
>  (".symtab" . 117000)
>  (".debug_line" . 97342)
>  (".debug_info" . 54519)
>  (".guile.frame-maps" . 47114)
>  ("" . 1344)
>  (".guile.arities.strtab" . 681)
>  ("" . 232)
>  (".shstrtab" . 229)
>  (".dynamic" . 112)
>  (".debug_str" . 87)
>  (".strtab" . 75)
>  (".debug_abbrev" . 65)
>  (".guile.docstrs.strtab" . 1)
>  ("" . 0)
>  (".guile.procprops" . 0)
>  (".guile.docstrs" . 0)
>  (".debug_loc" . 0))
>
> More than half of those 6 MiB is code, and more than 1 MiB is
> “.guile.arities” (info "(guile) Object File Format"), which is
> surprisingly large; presumably the file only contains thunks (the
> ‘thunked’ fields of ).

The guile.arities section starts with a sorted array of fixed-size
headers, then is followed by a sequence of ULEB128 references to local
variable names, including non-arguments.  The size is a bit perplexing,
I agree.  I can think of a number of ways to encode that section
differently but we'd need to understand a bit more about it and why the
baseline compiler is significantly different.
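
For readers who have not met the encoding: ULEB128 is the
variable-length unsigned integer format used in DWARF, with seven
payload bits per byte and the high bit set on every byte except the
last.  Below is a minimal Python sketch of the general encoding; it is
not taken from Guile's source, and the function names are hypothetical.

```python
def uleb128_encode(n):
    """Encode a non-negative integer as ULEB128 bytes."""
    if n < 0:
        raise ValueError("ULEB128 encodes unsigned integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low seven bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)


def uleb128_decode(data, pos=0):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


if __name__ == "__main__":
    for value in (0, 127, 128, 300, 624485):
        encoded = uleb128_encode(value)
        print(value, encoded.hex(), len(encoded), "byte(s)")
```

Values below 128 fit in a single byte, so a reference costs one byte as
long as the referenced index or offset stays small.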

> Stripping the .debug_* sections (if that works) clearly wouldn’t help.

I believe that it should eventually be possible to strip guile.arities,
fwiw.

> So I guess we could generate less code (reduce ‘.rtl-text’), perhaps by
> tweaking ‘define-record-type*’, but I have little hope there.

Hehe :)  As you mention later:

> With 3.0.3-to-be and -O1, python-xyz.go weighs in at 3.4 MiB instead of
> 5.9 MiB!  Here’s the section size distribution:
>
> $4 = ((".rtl-text" . 2101168)
>  (".data" . 586392)
>  (".rodata" . 360703)
>  (".guile.arities" . 193106)
>  (".symtab" . 117000)
>  (".debug_line" . 76685)
>  (".debug_info" . 53513)
>  ("" . 1280)
>  (".guile.arities.strtab" . 517)
>  ("" . 232)
>  (".shstrtab" . 211)
>  (".dynamic" . 96)
>  (".debug_str" . 87)
>  (".strtab" . 75)
>  (".debug_abbrev" . 56)
>  (".guile.docstrs.strtab" . 1)
>  ("" . 0)
>  (".guile.procprops" . 0)
>  (".guile.docstrs" . 0)
>  (".debug_loc" . 0))
> scheme@(guile-user)> (stat:size (stat go))
> $5 = 3519323
>
> “.rtl-text” is 38% smaller and “.guile.arities” is almost a tenth of
> what it was.

The difference in the text is due to the new baseline intrinsics,
e.g. $vector-ref.  It goes in the opposite direction from instruction
explosion, which sought to (1) make the JIT compiler easier by
decomposing compound operations into their atomic parts, (2) make the
optimizer learn more information from flow rather than type-checking
side effects, and (3) allow the optimizer to eliminate / hoist / move
the component pieces of macro-operations.

However in the baseline compiler (2) and (3) aren't possible because
there is no optimizer on that level, and therefore the result is
actually a loss -- 10 micro-ops cost more than 1 macro-op because of
stack traffic overhead, which isn't currently mitigated by the JIT (1).

So instruction explosion is residual code explosion, which should pay
off in theory, but not for the baseline compiler.  So I added new
intrinsics for e.g. $vector-ref et al.  Thus the smaller code size.

I am not sure what causes the significantly different .guile.arities
size!
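
To put numbers on the two listings quoted above, here is a small Python
sketch that computes how much each of the larger sections shrank.  The
byte counts are copied from those listings; the helper names are made
up for illustration.

```python
# Section sizes in bytes, copied from the two listings quoted above.
BEFORE = {
    ".rtl-text": 3417108,
    ".guile.arities": 1358536,
    ".data": 586912,
    ".rodata": 361599,
    ".debug_line": 97342,
}
AFTER = {
    ".rtl-text": 2101168,
    ".guile.arities": 193106,
    ".data": 586392,
    ".rodata": 360703,
    ".debug_line": 76685,
}


def shrink_percent(old, new):
    """Percentage by which a section got smaller (negative if it grew)."""
    return 100.0 * (old - new) / old


def compare(before, after):
    """Return (name, old, new, shrink %) for sections present in both."""
    rows = []
    for name, old in before.items():
        new = after.get(name)
        if new is None or old == 0:
            continue
        rows.append((name, old, new, shrink_percent(old, new)))
    return rows


if __name__ == "__main__":
    for name, old, new, pct in compare(BEFORE, AFTER):
        print(f"{name:16} {old:>9} -> {new:>9}  ({pct:.1f}% smaller)")
```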

> Something’s going on here!  Thoughts?

There are more possibilities for making code size smaller, e.g. having
two equivalent encodings for bytecode, where one is smaller:

  https://webkit.org/blog/9329/a-new-bytecode-format-for-javascriptcore/

Or it could be that if we could do register allocation for a
target-dependent fixed set of registers in bytecode already, that could
decrease minimum instruction size, making more instructions fit into
single 32-bit words.  Would be nice if the JIT could rely on the
bytecode compiler to already have done register allocation, and reify
corresponding debug information.  Just a thought though, and not really
appropriate to the baseline compiler.

Cheers,

Andy



Re: The size of ‘.go’ files

2020-06-07 Thread Pierre Neidhardt
Same here! :)

-- 
Pierre Neidhardt
https://ambrevar.xyz/




Re: The size of ‘.go’ files

2020-06-06 Thread Katherine Cox-Buday
Mathieu Othacehe writes:

> Having a lighter disk-image isn't very important on desktop, but for the
> embedded devices with small eMMC, any improvement would be really
> welcome :)

I was recently discussing a Guix buildbox container image I put together
with a coworker, and one of the tremendous downsides was the size of the
image. I think smaller images are important all around! I'm looking
forward to seeing your work.

-- 
Katherine



Re: The size of ‘.go’ files

2020-06-06 Thread Mathieu Othacehe


Hey Ludo,

> $ guix size $(readlink -f /run/current-system) | head -5
> store item                                                         total    self
> /gnu/store/4d0p06xgaw8lqa9db0d6728kkba8bizj-qemu-5.0.0            1651.6   745.2  18.8%
> /gnu/store/abiva5ivq99x30r2s9pa3jj0pv9g16sv-guix-1.1.0-4.bdc801e   468.0   268.8   6.8%
> /gnu/store/111zp1qyind7hsnvrm5830jhankmx4ls-linux-libre-5.4.43     243.6   243.6   6.2%
> /gnu/store/skxkrhgn9z0fg9hmnbcyfdgzs5w4ryrr-llvm-9.0.1             199.9   128.5   3.2%

When building a bare-bones Guix System disk-image, "Guix", "Guile" and
"Guile-static" represent 331M of .go files, see:

--8<---cut here---start->8---
find /gnu/store/fvvpmrgnvr9jqxfn5m956xblisa8vzr4-guix-1.1.0-4.bdc801e \
     /gnu/store/ljcrz0d86r20phszvj6s1mdyjchz79ja-guile-static-stripped-3.0.2 \
     /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2 \
     -name "*.go" -print0 | du --files0-from=- -hc | tail -n1
--8<---cut here---end--->8---
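
For anyone who wants the same total from a script, here is a rough
Python sketch of that pipeline.  It is only an approximation: it sums
apparent file sizes, whereas `du` reports disk usage, so the two totals
can differ slightly.  The function name is made up for illustration.

```python
import os


def total_go_bytes(roots):
    """Sum the apparent sizes of all '*.go' files under the given roots."""
    total = 0
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".go"):
                    continue
                path = os.path.join(dirpath, name)
                # Skip symlinks so a file is not counted via its link.
                if os.path.islink(path):
                    continue
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    import sys
    size = total_go_bytes(sys.argv[1:])
    print(f"{size / (1024 * 1024):.1f} MiB of .go files")
```

It would be invoked with the same store directories as arguments.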

Compared to the 943M of the "reduced" image I'm working on, that is
around 1/3 of the final image.

> With 3.0.3-to-be and -O1, python-xyz.go weighs in at 3.4 MiB instead of
> 5.9 MiB!  Here’s the section size distribution:

Wooh, interesting!

Having a lighter disk-image isn't very important on desktop, but for the
embedded devices with small eMMC, any improvement would be really
welcome :)

Thanks,

Mathieu