[racket-users] `raco test` seems slower, not caching any compilation results?

2020-03-18 Thread Ryan Kramer
I'm currently using Racket 7.6 (non-CS) on Windows, and it feels like `raco 
test` is much slower these days. Specifically, I am seeing:

`raco test tests.rkt` takes 10 seconds. This is fine the first time, but 
repeating the test immediately also takes 10 seconds. It never creates any 
"compiled" directories.

Meanwhile, if I open tests.rkt in DrRacket, the first run takes about 10 
seconds but then subsequent runs (without changes) take about 1 second. 
This creates a "compiled" directory, which I thought raco test would be 
able to use, but no luck -- raco test still takes 10 seconds.

I think this is a regression from 7.5, or else something strange is 
happening on my machine. I will try reverting to 7.5 soon to see if I can 
definitively determine the difference.

Does anyone know why this might be occurring?

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/b7c7b864-3208-4a98-bfc2-84b9c3a1a313%40googlegroups.com.


[racket-users] using plai/gc2/{collector,mutator} in one file

2020-03-18 Thread David Bremner

As part of an effort to use plai/gc2 with the racket handin server
(never having really successfully used the multiple file stuff), I've
been trying (and failing) to use modules in one file for the collector
and mutator.

Attached is the simplest possible example I could cook up using sample
code from the web. In Racket 7.6 I get a complaint about 'submod: not a
require sub-form ; in: (submod ".." null-gc)'. That doesn't seem right
to me, but I guess maybe allocator-setup was never tested with a
submodule path. Any workaround/correction is welcome.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/87a74ddvyp.fsf%40tethera.net.


test.rkt
Description: Binary data


Re: [racket-users] Re: Code generation performance

2020-03-18 Thread Eric Griffis
Made a tiny bit of progress today.

On a bigger machine, I was able to profile the giant unit tests module. It
has one top-level for/template that iterates over the 5 scalar types, and a
bunch of smaller ones inside that cover the multitude of operations for
each of the 4 fixed vector lengths.

  profiling (lib "glm/vector/tests.rkt")
  Initial code size: 5039
  Final code size  : 1019095

The good news is, I'm seeing around 200x compression. I mean, who would
mind getting completely DRY source for as little as 1/200th the effort?
(Assuming, of course, that programming time is proportional to program size
in bytes.)

The bad news is, compilation takes around 40 seconds on a modern desktop
with plenty of CPU and RAM. From the rest of the profiling output, it looks
like phase-0 for/template is responsible for about 12.5% of the total size,
but phase-1 for/list contributes 57.2% and phase-0 check contributes 48.4%.

I'm not sure how to interpret these numbers yet. On one hand, for/template
is essentially a for/list loop unroller, so the stats could just mean it
did its job. On the other hand, I don't know how much of that 57.2% is
merely the cost of doing business in Racket.
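
To make "loop unroller" concrete, here is a minimal sketch of expansion-time
unrolling with a plain pattern macro. It is not how for/template is
implemented (this version binds a variable with let rather than substituting
into a template), and the name unroll-list is made up for illustration.

```racket
#lang racket/base

;; Hypothetical macro: the `...` after the let form in the template means
;; each value gets its own copy of `body` at expansion time, so no loop
;; remains at run time.
(define-syntax-rule (unroll-list (x v ...) body)
  (list (let ([x v]) body) ...))

;; Expands to:
;;   (list (let ([x 1]) (* x x)) (let ([x 2]) (* x x)) (let ([x 3]) (* x x)))
(unroll-list (x 1 2 3) (* x x)) ; => '(1 4 9)
```

Three values produce three copies of the body, so the expanded code grows
with the iteration count even though the source stays small.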

When I comment out everything but the first two tests, I see this:

  Initial code size: 243
  Final code size  : 21725

That's a mere 89x compression, which is OK because the first two tests are
relatively simple, with phase-0 for/template accounting for 58.5% of the
total size, phase-1 for/list contributing 23.2%, and no phase-0 check.

It's starting to look like there isn't much I can do to bring down the
total size. But what about total compile time?

When I manually unroll the for/template forms, the profiler gives:

  Initial code size: 1509
  Final code size  : 21725

The identical final size is interesting -- it suggests the original output
sizes are what they would be if templates weren't used.
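
For reference, the size ratios behind the figures quoted so far can be
reproduced with a few lines of Racket. This is only a sketch of the
arithmetic; the helper name expansion is hypothetical, and it assumes the
profiler reports initial and final sizes in the same unit.

```racket
#lang racket/base

;; Final size divided by initial size, as an inexact number.
(define (expansion initial final)
  (exact->inexact (/ final initial)))

(expansion 5039 1019095) ; full test module: about 202
(expansion 243 21725)    ; first two tests with for/template: about 89
(expansion 1509 21725)   ; first two tests unrolled by hand: about 14
```

The hand-unrolled source is roughly six times larger than the for/template
source (1509 vs. 243), yet both expand to the same final size.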

This version takes, on average, 1.883 seconds to compile. The for/template
version takes 2.499 seconds, and an empty test suite takes 1.743 seconds.
That is 0.616 seconds more; subtracting out the control time (0.756 vs.
0.140 seconds), it took 5.4x longer to compile a fairly simple module with
for/template than without.
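
The comparison subtracts the empty-suite time from both measurements before
taking the ratio. A small sketch of that arithmetic, using only the three
timings quoted above (the variable names are mine):

```racket
#lang racket/base

;; Average compile times in seconds, as quoted above.
(define control   1.743) ; empty test suite
(define unrolled  1.883) ; for/template forms unrolled by hand
(define templated 2.499) ; for/template version

;; Time attributable to the tests themselves, with the control removed.
(define unrolled-net  (- unrolled control))  ; about 0.140
(define templated-net (- templated control)) ; about 0.756

;; Ratio of the net times: about 5.4.
(define slowdown (/ templated-net unrolled-net))

(printf "net: ~a s vs ~a s, slowdown ~ax\n"
        unrolled-net templated-net slowdown)
```

Because the control time dominates both raw measurements, the raw ratio
(2.499 / 1.883, about 1.3) looks far milder than the net ratio.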

Is the extra cost acceptable? I'm guessing that's highly context dependent.
In this case, adding half a second to compile one module wouldn't
inconvenience me terribly, but it doesn't take much imagination to find a
situation where it would, and I have no idea how any of these numbers will
scale.

Eric


On Sat, Mar 14, 2020 at 3:28 PM Eric Griffis wrote:

> Alright, I re-discovered Ryan Culpepper's talk, "The Cost of Sugar," from
> the RacketCon 2018 video stream (https://youtu.be/CLjXhr_TgP8?t=5908) and
> made some progress by following along.
>
> Here are the .zo files larger than 100K:
>
> 993K ./vector/compiled/tests_rkt.zo
> 830K ./scribblings/compiled/glm_scrbl.zo
> 328K ./vector/compiled/relational_rkt.zo
> 295K ./vec4/compiled/bool_rkt.zo
> 291K ./vec4/compiled/int_rkt.zo
> 290K ./vec4/compiled/uint_rkt.zo
> 290K ./vec4/compiled/double_rkt.zo
> 289K ./vec4/compiled/float_rkt.zo
> 280K ./vec3/compiled/bool_rkt.zo
> 276K ./vec3/compiled/int_rkt.zo
> 275K ./vec3/compiled/uint_rkt.zo
> 275K ./vec3/compiled/double_rkt.zo
> 274K ./vec3/compiled/float_rkt.zo
> 262K ./vec2/compiled/bool_rkt.zo
> 258K ./vec2/compiled/uint_rkt.zo
> 258K ./vec2/compiled/int_rkt.zo
> 258K ./vec2/compiled/double_rkt.zo
> 257K ./vec2/compiled/float_rkt.zo
> 213K ./vec1/compiled/bool_rkt.zo
> 210K ./vec1/compiled/uint_rkt.zo
> 210K ./vec1/compiled/int_rkt.zo
> 210K ./vec1/compiled/double_rkt.zo
> 209K ./vec1/compiled/float_rkt.zo
> 102K ./compiled/main_rkt.zo
> 101K ./compiled/vector_rkt.zo
>
> I'm pretty sure that's a lot of big files. It's for a port of GLM, a
> graphics math library that implements (among other things) fixed-length
> vectors of up to 4 components over 5 distinct scalar types, for a total of
> 20 distinct type-length combinations with many small variations in their
> APIs and implementations.
>
> The variations I'm targeting either require a macro or exacerbate
> developer- or run-time overhead when functions are introduced. For example,
> the base component accessors for a four-component vector of doubles are:
>
>   dvec4-x
>   dvec4-y
>   dvec4-z
>   dvec4-w
>
> Each of the "xyzw" components has two aliases -- one from "rgba" and
> another from "stpq". Each accessor also has a corresponding mutator, e.g.,
> dvec4-g and set-dvec4-g!.
>
> For another example, whereas adding two dvec4's sums four components,
>
>   (dvec4
>    (fl+ (dvec4-x v1) (dvec4-x v2))
>    (fl+ (dvec4-y v1) (dvec4-y v2))
>    (fl+ (dvec4-z v1) (dvec4-z v2))
>    (fl+ (dvec4-w v1) (dvec4-w v2)))
>
> the same operation on dvec2's sums only the first two components.
>
> Furthermore, the sheer volume of the target code base makes writing
> everything out by hand a mind-numbing exercise in frustration, and that's
> when looking at a mere 20% of the pile. It's going to get much