Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-28 Thread 'Paulo Matos' via Racket Users



On 26/09/2018 19:39, Philip McGrath wrote:
> On Wed, Sep 26, 2018 at 1:36 AM Paulo Matos  wrote:
> 
> I am keen on hearing about alternatives. The reason to do it like this is
> to minimize friction with clients. Clients in the area of development
> tools expect something that they can execute and generally are not too
> keen on scripty calls like `python foo.py`, so if I said: Please run the
> program with `racket s10.rkt` ... it would very quickly lower my
> possibility of a sale. Racket distribution creates essentially the
> package that they expect, something they can run out of the box without
> thinking much about dependencies in something that looks like an
> executable - even if it's just more or less a racket shell. Your product
> appearance is important and therefore I want to give something they are
> already used to.
> 
> 
> I definitely understand wanting users to be able to run the application
> in a way that feels as normal as possible to them: I think about this
> even with internal tools that I develop for collaborators with a limited
> technical background.
> 
> I think the approach I brought up would be compatible with making
> distributions. What I had in mind (but see below) was something like:
> ;; x86_64-no-contracts.rkt
> #lang racket/base
> (require "../run-program.rkt")
> (run-program #:arch 'x86_64 #:contracts? #f)
> 
> ;; i386-with-contracts.rkt
> #lang racket/base
> (require "../run-program.rkt")
> (run-program #:arch 'i386 #:contracts? #t)
> 
> Then `raco exe`/`create-embedding-executable`/whatever can work on
> either version.
> 
> > Most seriously, depending on exactly how you use these compile-time
> > environment variables, it seems like you could negate some of the
> > benefits of the "separate compilation guarantee," particularly if you
> > are assuming that all of your modules are compiled at the same time.
> >
> 
> Why would that be a problem?
> 
> 
> This is a longer discussion and I am by no means an expert, but I can
> point you to the "separate compilation guarantee" docs
> (http://docs.racket-lang.org/reference/eval-model.html#(part._separate-compilation))
> and Matthew Flatt's paper "Composable and Compilable Macros: You Want it
> /When?/" (https://www.cs.utah.edu/plt/publications/macromod.pdf).
> 

I took the time to think through this and skim through the paper. I
understand what you mean by separate compilation guarantee. I am not
convinced what I am doing breaks the separate compilation guarantee.
From what I understood this is broken if there are any side-effects of
the compilation process, i.e. printing, writing to files, etc. Which
there are none with the approach I take. However, I might be missing
some subtle detail. I haven't yet internalised all of the information
with regards to phase levels.

> I don't think what you are doing circumvents the separate compilation
> guarantee per se, 

+1 ah, should have read this before writing the above. :)

> because you don't produce "external effects" (e.g.
> I/O) during compilation and then rely on those side-effects during
> run-time. But, while I have not thought especially deeply about this,
> using environment variables this way seems to be sort of the mirror
> image: the state of the universe external to the Racket runtime system
> has effects on the compilation of your modules, and it seems like that
> might introduce similar problems.
> 

This might be one of those subtleties I don't quite understand, and such
problems might well be occurring. As I said in the OP, I am still having
problems that I can't reproduce in a small example and that only occur
when the code is embedded in the executable.

> In particular, "the practical consequence of [the separate compilation]
> guarantee is that because effects are never visible, no module can
> detect whether a module it requires is already compiled. Thus, it can
> never change the compilation of one module to have already compiled a
> different module." This has all kinds of nice benefits that are detailed
> in the paper.
> 

This might be related to what I am seeing... which looks like this:
instantiate-linklet: mismatch;
 reference to a variable that is uninitialized;
 possibly, bytecode file needs re-compile because dependencies changed
  name: g8759.1
  exporting instance: '#%embedded:g28621:stochastic-statistics
  importing instance: '#%embedded:g26566:stochastic


On the other hand, it's strange that this only happens when the code is
embedded in the executable... so I am not sure if it's actually this or
just a bug in my phase 1 code in the embedding process.

> In your case you seem to assume that all of the modules are compiled at
> the same time (i.e. with the same set of environment variables). 

Correct.

> You do
> seem to put in the effort to actually uphold that assumption, and maybe
> there are compelling reasons to do so in your case, but I would suggest
> thinking carefully about that decision.

Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-26 Thread Philip McGrath
On Wed, Sep 26, 2018 at 1:36 AM Paulo Matos  wrote:

> I am keen on hearing about alternatives. The reason to do it like this is
> to minimize friction with clients. Clients in the area of development
> tools expect something that they can execute and generally are not too
> keen on scripty calls like `python foo.py`, so if I said: Please run the
> program with `racket s10.rkt` ... it would very quickly lower my
> possibility of a sale. Racket distribution creates essentially the
> package that they expect, something they can run out of the box without
> thinking much about dependencies in something that looks like an
> executable - even if it's just more or less a racket shell. Your product
> appearance is important and therefore I want to give something they are
> already used to.
>

I definitely understand wanting users to be able to run the application in
a way that feels as normal as possible to them: I think about this even
with internal tools that I develop for collaborators with a limited
technical background.

I think the approach I brought up would be compatible with making
distributions. What I had in mind (but see below) was something like:
;; x86_64-no-contracts.rkt
#lang racket/base
(require "../run-program.rkt")
(run-program #:arch 'x86_64 #:contracts? #f)

;; i386-with-contracts.rkt
#lang racket/base
(require "../run-program.rkt")
(run-program #:arch 'i386 #:contracts? #t)

Then `raco exe`/`create-embedding-executable`/whatever can work on either
version.
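
For illustration, here is a minimal sketch of what the shared
`run-program.rkt` might look like. The parameter names `current-arch` and
`contracts-enabled?` and the `main` function are hypothetical and not taken
from the actual application; the configuration is applied with
`parameterize` at runtime:

```racket
#lang racket/base
;; run-program.rkt -- hypothetical sketch; all names here are assumptions.
(provide run-program current-arch contracts-enabled?)

;; Runtime configuration, with defaults that each variant module overrides.
(define current-arch (make-parameter 'x86_64))
(define contracts-enabled? (make-parameter #f))

;; Entry point called by each variant module with its own configuration.
(define (run-program #:arch arch #:contracts? contracts?)
  (parameterize ([current-arch arch]
                 [contracts-enabled? contracts?])
    (main)))

;; Stand-in for the real program: it only reports the active configuration.
(define (main)
  (printf "arch: ~a, contracts enabled: ~a\n"
          (current-arch)
          (contracts-enabled?)))
```

Each variant file then stays a few lines long, and several variants can be
compiled and kept side by side.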

> Most seriously, depending on exactly how you use these compile-time
> > environment variables, it seems like you could negate some of the
> > benefits of the "separate compilation guarantee," particularly if you
> > are assuming that all of your modules are compiled at the same time.
> >
>
> Why would that be a problem?
>

This is a longer discussion and I am by no means an expert, but I can point
you to the "separate compilation guarantee" docs (
http://docs.racket-lang.org/reference/eval-model.html#(part._separate-compilation))
and Matthew Flatt's paper "Composable and Compilable Macros: You Want it
*When?*" (https://www.cs.utah.edu/plt/publications/macromod.pdf).

I don't think what you are doing circumvents the separate compilation
guarantee per se, because you don't produce "external effects" (e.g. I/O)
during compilation and then rely on those side-effects during run-time.
But, while I have not thought especially deeply about this, using
environment variables this way seems to be sort of the mirror image: the
state of the universe external to the Racket runtime system has effects on
the compilation of your modules, and it seems like that might introduce
similar problems.

In particular, "the practical consequence of [the separate compilation]
guarantee is that because effects are never visible, no module can detect
whether a module it requires is already compiled. Thus, it can never change
the compilation of one module to have already compiled a different module."
This has all kinds of nice benefits that are detailed in the paper.

In your case you seem to assume that all of the modules are compiled at the
same time (i.e. with the same set of environment variables). You do seem to
put in the effort to actually uphold that assumption, and maybe there are
compelling reasons to do so in your case, but I would suggest thinking
carefully about that decision.


> That is a possibility but I run into the problem of having too many
> modules as the number of possible configurations can easily explode as I
> add more configurations and backends to my application. Getting one file
> per compile-time configuration is essentially not workable.
>
> My CI script currently creates 48 configurations for testing. That's 4
> boolean options plus 3 backend architectures that I either completely
> support or are under development. If I get a couple of customers wanting
> different architectures, I can easily go to maybe 5 or 6 backends and
> have more than 100 configurations to test. I would probably have to stop
> testing _every_ config at some point and choose which configs are more
> important but this seems to be the problem with this kind of choice - in
> GCC world we have the same problem and ended up having to create several
> tiers of backends/configurations to be able to do a proper job at
> testing and releasing.
>
> If I am missing some fundamental way Racket can help with this, I am
> open to other options but it all boils down to moving as many things to
> compile time config so I can drop unnecessary code and try to make this
> as fast as possible.
>

Yes, I see the issue here. I don't have a fully worked-out solution and
probably can't without knowing your requirements in detail, but I have a
few general thoughts.

How much of this configuration really needs to happen at compile-time vs.
runtime? In the examples you've sent, you effectively turn
"S10_ENABLE_CONTRACTS" and "ARCH" into runtime constants. If it's just a
matter that you 

Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-26 Thread 'Paulo Matos' via Racket Users



On 25/09/2018 23:38, Ryan Culpepper wrote:
> On 9/25/18 1:11 PM, Alexis King wrote:
>> [] Personally, I would appreciate a way to ask
>> Racket to strip all phase ≥1 code and phase ≥1 dependencies from a
>> specified program so that I can distribute the phase 0 code and
>> dependencies exclusively. However, to my knowledge, Racket does not
>> currently include any such feature.
> 
> `raco demod -g` seems to do that, but the `-g` option is marked with a
> warning.
> 
> Ryan
> 

Once you have this new demod'ed .zo file, how do you run it? With the
example from my original post, I get the demod'ed file and try to run it
with racket. It doesn't crash, but it also prints nothing.

-- 
Paulo Matos

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-26 Thread 'Paulo Matos' via Racket Users



On 25/09/2018 23:44, Philip McGrath wrote:
> On Tue, Sep 25, 2018 at 3:46 PM 'Paulo Matos' via Racket Users
> <racket-users@googlegroups.com> wrote:
> 
> OK, so I understand now that what I want is an unimplemented feature,
> but in most compilers these days and certainly those based in LLVM and
> GCC there's a feature called whole-program optimization or link time
> optimization. Basically the compiler will get the whole program in
> memory after compiling each module/file and run the optimizations on the
> whole thing again therefore being able to optimize things it wasn't able
> to optimize before when it only had a module/file view.
> 
> 
> I know that `raco demodularize` can flatten a whole program
> (https://docs.racket-lang.org/raco/demod.html). I think I remember
> reading a paper that looked at using this as the basis for whole-program
> optimization, but I don't remember the results of the paper, much less
> anything about doing this in practice.
>  

You are right, and as Ryan mentioned in another post -g will re-compile
after the de-modularization - in effect doing whole-program
optimization. I am amazed and scared at the same time. :)

If you recall the paper please let me know.

> 
> >> My software has several compile time options that use environment
> >> variables to be read (since I can't think of another way to do
> it) so I
> >> define a compile time variable as:
> >>
> >> (define-for-syntax enable-contracts?
> >> (and (getenv "S10_ENABLE_CONTRACTS") #true))
> >>
> >> And then I create a macro to move this compile-time variable to
> runtime:
> >> (define-syntax (compiled-with-contracts? stx)
> >> (datum->syntax stx enable-contracts?))
> >>
> >> I have a few of these so when I create a distribution, I first
> create an
> >> executable with (I use create-embedding-executable but for
> simplicity,
> >> lets say I am using raco):
> >> S10_ENABLE_CONTRACTS=1 raco exe ...
> 
> 
> More broadly, the thing that first struck me is that most Racket
> programs don't seem to use this sort of build process. I do think they
> should be /able/ to, and there may be good reasons to do that in your
> case, but I've been trying to think about what it is about this that
> strikes me as un-Racket-ey and what alternatives might be.
> 

I am keen on hearing about alternatives. The reason to do it like this is
to minimize friction with clients. Clients in the area of development
tools expect something that they can execute and generally are not too
keen on scripty calls like `python foo.py`, so if I said: Please run the
program with `racket s10.rkt` ... it would very quickly lower my
possibility of a sale. Racket distribution creates essentially the
package that they expect, something they can run out of the box without
thinking much about dependencies in something that looks like an
executable - even if it's just more or less a racket shell. Your product
appearance is important and therefore I want to give something they are
already used to.

> I think one part of it is that compiling differently based on
> environment variables seems to go against the principle that "Racket
> internalizes extra-linguistic mechanisms"
> (http://felleisen.org/matthias/manifesto/sec_intern.html). For a
> practical example, environment variables are vexing on Mac OS for GUI
> programs.

I'm not sure about Mac OS because I have never used one. Before this
project I hadn't needed to do something like this, and when I started I
was pointed to how Typed Racket does it for its contracts:
(define-for-syntax enable-contracts? (and (getenv "PLT_TR_CONTRACTS") #t))

in
https://github.com/racket/typed-racket/blob/1825355c4879b6263b0c8fe88b30e11d79fc0d31/typed-racket-lib/typed-racket/utils/utils.rkt#L43
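
A self-contained sketch of that pattern, showing how the compile-time flag
reaches runtime code. `checked-double` is a hypothetical function used only
for illustration; the environment variable name is the one from my
original post:

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Read once, at compile time (phase 1).
(define-for-syntax enable-contracts?
  (and (getenv "S10_ENABLE_CONTRACTS") #t))

;; Expands to the literal #t or #f computed when this module was compiled.
(define-syntax (compiled-with-contracts? stx)
  (datum->syntax stx enable-contracts?))

;; Hypothetical function: the argument check sits behind a compile-time
;; constant, so it is only active in builds made with the variable set.
(define (checked-double n)
  (when (compiled-with-contracts?)
    (unless (number? n)
      (raise-argument-error 'checked-double "number?" n)))
  (* 2 n))

(printf "contracts enabled: ~a\n" (compiled-with-contracts?))
(printf "~a\n" (checked-double 21))
```

Because the test expression is a literal after expansion, the value is
fixed at compile time; changing the environment variable afterwards has no
effect until the module is recompiled.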


So, I created a Racket app to pack my application, so you can actually
do something like:
racket create-s10-distribution.rkt --with-contracts --with-statistics ./release-folder

This will put an S10 release in ./release-folder that can be moved to
another system.

Side note: This is also useful for benchmarking as I can easily
dockerize it, and get it running on a big AWS machine without worrying
about installing racket, etc.

> 
> Another issue is that this approach means that only one compiled
> configuration exists at a time. In some cases maybe that's right: I've
> had files in which I manually switch the definition of a macro to, say,
> add some `printf`s that I only want when I'm actively working on that
> specific file. But more often, if it makes sense for multiple versions
> of a program to exist—say, your example of different architectures—I
> think it also makes sense for them to be able to exist simultaneously.
> 

At the moment I have no issues with this. I usually run it in debug mode
with say: --with-contracts --with-places-debug --with-statistics
--with-timing etc.

Then when I push a change, CI will compile quite a few different

Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-26 Thread 'Paulo Matos' via Racket Users



On 25/09/2018 23:38, Ryan Culpepper wrote:
> On 9/25/18 1:11 PM, Alexis King wrote:
>> [] Personally, I would appreciate a way to ask
>> Racket to strip all phase ≥1 code and phase ≥1 dependencies from a
>> specified program so that I can distribute the phase 0 code and
>> dependencies exclusively. However, to my knowledge, Racket does not
>> currently include any such feature.
> 
> `raco demod -g` seems to do that, but the `-g` option is marked with a
> warning.

Wow, ok, I am officially amazed - but at the same time slightly scared
of trying it on a very large application. :)


What do you mean 'marked with a warning'?
raco demod --help just says:
-r, --recompile : Recompile final module to re-run optimizations

There's no warning there.

-- 
Paulo Matos



Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-25 Thread Philip McGrath
On Tue, Sep 25, 2018 at 3:46 PM 'Paulo Matos' via Racket Users <
racket-users@googlegroups.com> wrote:

> OK, so I understand now that what I want is an unimplemented feature,
> but in most compilers these days and certainly those based in LLVM and
> GCC there's a feature called whole-program optimization or link time
> optimization. Basically the compiler will get the whole program in
> memory after compiling each module/file and run the optimizations on the
> whole thing again therefore being able to optimize things it wasn't able
> to optimize before when it only had a module/file view.
>

I know that `raco demodularize` can flatten a whole program (
https://docs.racket-lang.org/raco/demod.html). I think I remember reading a
paper that looked at using this as the basis for whole-program
optimization, but I don't remember the results of the paper, much less
anything about doing this in practice.


> >> My software has several compile time options that use environment
> >> variables to be read (since I can't think of another way to do it) so I
> >> define a compile time variable as:
> >>
> >> (define-for-syntax enable-contracts?
> >> (and (getenv "S10_ENABLE_CONTRACTS") #true))
> >>
> >> And then I create a macro to move this compile-time variable to runtime:
> >> (define-syntax (compiled-with-contracts? stx)
> >> (datum->syntax stx enable-contracts?))
> >>
> >> I have a few of these so when I create a distribution, I first create an
> >> executable with (I use create-embedding-executable but for simplicity,
> >> lets say I am using raco):
> >> S10_ENABLE_CONTRACTS=1 raco exe ...
>

More broadly, the thing that first struck me is that most Racket programs
don't seem to use this sort of build process. I do think they should be
*able* to, and there may be good reasons to do that in your case, but I've
been trying to think about what it is about this that strikes me as
un-Racket-ey and what alternatives might be.

I think one part of it is that compiling differently based on environment
variables seems to go against the principle that "Racket internalizes
extra-linguistic mechanisms" (
http://felleisen.org/matthias/manifesto/sec_intern.html). For a practical
example, environment variables are vexing on Mac OS for GUI programs.

Another issue is that this approach means that only one compiled
configuration exists at a time. In some cases maybe that's right: I've had
files in which I manually switch the definition of a macro to, say, add
some `printf`s that I only want when I'm actively working on that specific
file. But more often, if it makes sense for multiple versions of a program
to exist—say, your example of different architectures—I think it also makes
sense for them to be able to exist simultaneously.

Most seriously, depending on exactly how you use these compile-time
environment variables, it seems like you could negate some of the benefits
of the "separate compilation guarantee," particularly if you are assuming
that all of your modules are compiled at the same time.

In terms of an alternative, I will pass on an approach suggested to me by
Jack Firth, Tony Garnock-Jones, and others on this list (
https://groups.google.com/d/msg/racket-users/ftr4GDy7LG0/Qcm2LaNQDAAJ):
instead of having one "main module" that is your program, define each
variant of your program as a module that declares its configuration. I have
been doing this myself, and I'm very happy with the approach. In my case
the configuration consists of using `parameterize` at runtime, but I expect
you could find a way to express whatever you need to.

-Philip


> >>
> >> I have a bunch of other options that don't matter for the moment.
> >>
> >> One of the things I noticed is that in some cases when I run my
> >> executable, compile time code living inside begin-for-syntax to check if
> >> a variable has been defined during compilation or not is triggered, at a
> >> point where I didn't expect any more syntax expansion to occur.
> >>
> >> I can't really reproduce the issue with a small example yet but I
> >> noticed something:
> >>
> >> main.rkt:
> >>
> >> #lang racket
> >>
> >> (require (file "arch-choice.rkt"))
> >>
> >> (module+ main
> >> (printf "arch: ~a~n" (get-path)))
> >>
> >> arch-choice.rkt:
> >>
> >> #lang racket
> >>
> >> (provide get-path)
> >>
> >> (begin-for-syntax
> >>
> >> (define arch-path (getenv "ARCH"))
> >>
> >> (unless arch-path
> >>   (raise-user-error 'driver "Please define ARCH with a suitable path")))
> >>
> >> (define-syntax (get-path stx)
> >> (datum->syntax stx arch-path))
> >>
> >> Then just to make sure nothing is compiled I remove my zos:
> >> $ find . -type f -name '*.zo' -exec rm \{\} \;
> >>
> >> Then compile it:
> >> $ ARCH=foo raco exe main.rkt
> >>
> >> In this case if you run ./main you'll get 'arch: foo' back which is fine
> >> so I can't reproduce what I see in my software which is with some
> >> combinations of compile time options, I see:
> >> 'driver: Please define ARCH environment variable'
> >>

Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-25 Thread Ryan Culpepper

On 9/25/18 1:11 PM, Alexis King wrote:

[] Personally, I would appreciate a way to ask
Racket to strip all phase ≥1 code and phase ≥1 dependencies from a
specified program so that I can distribute the phase 0 code and
dependencies exclusively. However, to my knowledge, Racket does not
currently include any such feature.


`raco demod -g` seems to do that, but the `-g` option is marked with a 
warning.


Ryan



Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-25 Thread 'Paulo Matos' via Racket Users



On 25/09/2018 20:11, Alexis King wrote:
> (Sorry, Paulo, for the duplicate message; I forgot to Reply All the
> first time.)
> 
> This is sort of subtle. When we consider a macro-enabled language, we
> often imagine that `expand` takes a program with some phase ≥1 code,
> expands all the macros in the program by running the phase ≥1 code, and
> produces a fully-expanded program with only phase 0 code left. There is
> some truth to this, but it doesn’t paint the whole picture.
> 
> [snip] [snip]

Alexis, thanks for the thorough reply. I understood everything, at least
up until this point.


> The above explains why Racket retains some phase ≥1 code. However, it
> may be unsatisfying: while it’s true that the phase ≥1 code might be
> necessary for compilation of other modules, once you have compiled your
> whole program, it shouldn’t be necessary to keep that information
> around, right? 

Right! That's exactly what I was thinking...

> No other modules will ever need to be compiled against
> the macro-providing module. However, this is not necessarily true!
> Racket provides a set of reflective operations for compiling modules at
> runtime, and it makes no assumptions that all modules will be loaded
> from compiled code. In this sense, Racket includes an “open-world
> assumption” when compiling modules, and it retains any phase ≥1 code
> necessary for compiling new modules at any time.
> 

OK, so I understand now that what I want is an unimplemented feature,
but in most compilers these days and certainly those based in LLVM and
GCC there's a feature called whole-program optimization or link time
optimization. Basically the compiler will get the whole program in
memory after compiling each module/file and run the optimizations on the
whole thing again therefore being able to optimize things it wasn't able
to optimize before when it only had a module/file view.

Now, in Racket when I compile an executable, although it's true there
might be dynamic-requires, if you look at the example I posted there's
not even one. Surely it's possible to remove all the phase>=1 code,
correct? Is it just the case that this kind of global optimization is
not yet implemented?

Even with dynamic-requires, if the dynamic-require depends on a compile
time variable that contains the path, after compilation the
dynamic-require won't change and will always require the same file,
therefore we can do the same kind of phase >= 1 code cleanup.

Am I missing any subtlety here or are these feasible but we are just
missing these optimizations?

> This sort of thing is necessary to implement tools like DrRacket, which
> frequently compile new modules at runtime, but admittedly, most programs
> don’t do any such thing. Personally, I would appreciate a way to ask
> Racket to strip all phase ≥1 code and phase ≥1 dependencies from a
> specified program so that I can distribute the phase 0 code and
> dependencies exclusively. However, to my knowledge, Racket does not
> currently include any such feature.
> 

Again, here I assume that in some cases, like the ones I mentioned above
you wouldn't even have to ask. It could be done automatically.

> For more information on declaring, instantiating, and visiting modules,
> and how that relates to compilation, see this very helpful section in
> The Racket Guide:
> 
>    http://docs.racket-lang.org/guide/macro-module.html
> 

Thank you for the reference.

Paulo Matos
> 
>> On Sep 25, 2018, at 07:32, 'Paulo Matos' via Racket Users 
>>  wrote:
>>
>>
>> Hi,
>>
>> I reached a point at which I don't think I am exactly understanding how
>> the racket compilation pipeline works.
>>
>> My software has several compile time options that use environment
>> variables to be read (since I can't think of another way to do it) so I
>> define a compile time variable as:
>>
>> (define-for-syntax enable-contracts?
>> (and (getenv "S10_ENABLE_CONTRACTS") #true))
>>
>> And then I create a macro to move this compile-time variable to runtime:
>> (define-syntax (compiled-with-contracts? stx)
>> (datum->syntax stx enable-contracts?))
>>
>> I have a few of these so when I create a distribution, I first create an
>> executable with (I use create-embedding-executable but for simplicity,
>> lets say I am using raco):
>> S10_ENABLE_CONTRACTS=1 raco exe ...
>>
>> I have a bunch of other options that don't matter for the moment.
>>
>> One of the things I noticed is that in some cases when I run my
>> executable, compile time code living inside begin-for-syntax to check if
>> a variable has been defined during compilation or not is triggered, at a
>> point where I didn't expect any more syntax expansion to occur.
>>
>> I can't really reproduce the issue with a small example yet but I
>> noticed something:
>>
>> main.rkt:
>>
>> #lang racket
>>
>> (require (file "arch-choice.rkt"))
>>
>> (module+ main
>> (printf "arch: ~a~n" (get-path)))
>>
>> arch-choice.rkt:
>>
>> #lang racket
>>
>> (provide get-path)
>>
>> (begin-for-syntax
>>
>> 

Re: [racket-users] Compilation/Embedding leaves syntax traces

2018-09-25 Thread Alexis King
(Sorry, Paulo, for the duplicate message; I forgot to Reply All the
first time.)

This is sort of subtle. When we consider a macro-enabled language, we
often imagine that `expand` takes a program with some phase ≥1 code,
expands all the macros in the program by running the phase ≥1 code, and
produces a fully-expanded program with only phase 0 code left. There is
some truth to this, but it doesn’t paint the whole picture.

Let’s start with the things that ARE true:

   1. When a module is compiled, it is fully expanded.

   2. Fully-expanded code contains no macro uses.

   3. Instantiating a compiled module at phase 0 does not normally run
      any phase ≥1 code, unless the module uses reflective operations
      like dynamic-require that may trigger compilation of other
      modules at runtime or explicitly instantiate modules into a
      namespace at phase ≥1.

These three things align with our intuition. If you have the program

   (+ (mac) 1 2)

where `mac` is a macro, then when the module is compiled, the use of
`mac` goes away, and it is replaced with its expansion.
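
As a rough illustration, the expansion can be inspected with `expand`. This
is only a sketch: the definition of `mac` below is hypothetical (any macro
would do), and `symbols-in` is a small helper written for this example:

```racket
#lang racket/base

;; A fresh namespace with racket/base, standing in for "the program".
(define ns (make-base-namespace))

(define expanded
  (parameterize ([current-namespace ns])
    ;; Hypothetical definition of `mac`, chosen only for this example.
    (eval '(define-syntax-rule (mac) 39))
    ;; Fully expand the form and strip it down to a plain datum.
    (syntax->datum (expand '(+ (mac) 1 2)))))

;; Helper: collect every symbol that occurs in a datum.
(define (symbols-in d)
  (cond [(pair? d) (append (symbols-in (car d)) (symbols-in (cdr d)))]
        [(symbol? d) (list d)]
        [else '()]))

(printf "expanded: ~s\n" expanded)
(printf "mentions mac? ~a\n" (and (memq 'mac (symbols-in expanded)) #t))
```

The printed form contains only core forms and the literal that `mac`
produced; the name `mac` itself no longer appears.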

Now, let’s add one more true thing to the list that aligns with our
intuition, but hints at something more complicated:

   4. When a module is expanded, all LOCAL macro definitions disappear.

This means that if you define a macro with let-syntax (or, equivalently,
define-syntax in an internal definition context), then all of the code
that implements that macro goes away after expansion. This is consistent
with our intuition, but it raises the question: why does this only happen
for local macros? Shouldn’t this happen for all macros?

Sadly, no. Consider the following module:

   (module m racket
     (provide mac)
     (define-syntax (mac stx)
       ))

In this case, the RHS of the `mac` definition must remain in the
compiled code, since some other module could require `m` and use `mac`.
Although the RHS of the `mac` definition is not evaluated when `m` is
instantiated at phase 0 (as is specified by rule 3 above), it must be
evaluated during compilation of another module that uses `m`.

(The technical term for this in Racket is called “visiting” the module.
This process of evaluating the RHS of define-syntax forms during module
visits also applies to any forms inside begin-for-syntax blocks, and a
visit also instantiates any modules required for-syntax by the visited
module. The nitty-gritty details are subtle, but this explains why code
on the RHS of module-level define-syntax forms or inside
begin-for-syntax blocks must be kept around in compiled code.)
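
A small sketch of visiting, using two submodules in one file. The module
names `m` and `user`, the macro body, and the messages are all made up for
this example:

```racket
#lang racket/base

;; m exports a macro, so its phase 1 code has to stay in the compiled form.
(module m racket/base
  (require (for-syntax racket/base))
  (provide mac)
  (begin-for-syntax
    ;; Phase 1 code: runs when m is expanded or visited, not when m is
    ;; merely instantiated at phase 0.
    (eprintf "phase 1: visiting m\n"))
  (define-syntax (mac stx)
    #'42))

;; Compiling user requires a visit of m, because user uses mac.
(module user racket/base
  (require (submod ".." m))
  (provide result)
  (define result (mac)))

(require 'user)
(printf "phase 0 result: ~a\n" result)
```

Run from source, the phase 1 message should show up on stderr while the
file is being compiled. After `raco make`, running the compiled file should
print only the phase 0 line, which is rule 3 in action.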

The above explains why Racket retains some phase ≥1 code. However, it
may be unsatisfying: while it’s true that the phase ≥1 code might be
necessary for compilation of other modules, once you have compiled your
whole program, it shouldn’t be necessary to keep that information
around, right? No other modules will ever need to be compiled against
the macro-providing module. However, this is not necessarily true!
Racket provides a set of reflective operations for compiling modules at
runtime, and it makes no assumptions that all modules will be loaded
from compiled code. In this sense, Racket includes an “open-world
assumption” when compiling modules, and it retains any phase ≥1 code
necessary for compiling new modules at any time.

This sort of thing is necessary to implement tools like DrRacket, which
frequently compile new modules at runtime, but admittedly, most programs
don’t do any such thing. Personally, I would appreciate a way to ask
Racket to strip all phase ≥1 code and phase ≥1 dependencies from a
specified program so that I can distribute the phase 0 code and
dependencies exclusively. However, to my knowledge, Racket does not
currently include any such feature.

For more information on declaring, instantiating, and visiting modules,
and how that relates to compilation, see this very helpful section in
The Racket Guide:

   http://docs.racket-lang.org/guide/macro-module.html

Alexis

> On Sep 25, 2018, at 07:32, 'Paulo Matos' via Racket Users 
>  wrote:
> 
> 
> Hi,
> 
> I reached a point at which I don't think I am exactly understanding how
> the racket compilation pipeline works.
> 
> My software has several compile time options that use environment
> variables to be read (since I can't think of another way to do it) so I
> define a compile time variable as:
> 
> (define-for-syntax enable-contracts?
> (and (getenv "S10_ENABLE_CONTRACTS") #true))
> 
> And then I create a macro to move this compile-time variable to runtime:
> (define-syntax (compiled-with-contracts? stx)
> (datum->syntax stx enable-contracts?))
> 
> I have a few of these so when I create a distribution, I first create an
> executable with (I use create-embedding-executable but for simplicity,
> lets say I am using raco):
> S10_ENABLE_CONTRACTS=1 raco exe ...
> 
> I have a bunch of other options that don't matter for the moment.
> 
> One of the things I noticed is that in some cases when I run my
>