Re: [racket-users] Compilation/Embedding leaves syntax traces

'Paulo Matos' via Racket Users Fri, 28 Sep 2018 02:55:36 -0700

On 26/09/2018 19:39, Philip McGrath wrote:
> On Wed, Sep 26, 2018 at 1:36 AM Paulo Matos <pmatos@linki.tools> wrote:
> 
>     I am keen on hearing about alternatives. The reason to do like this is
>     to minimize friction with clients. Clients in the area of development
>     tools expect something that they can execute and generally are not too
>     keen on scripty calls like `python foo.py`, so if I said: Please run the
>     program with `racket s10.rkt` ... it would very quickly lower my
>     possibility of a sale. Racket distribution creates essentially the
>     package that they expect, something they can run out of the box without
>     thinking much about dependencies in something that looks like an
>     executable - even if it's just more or less a racket shell. Your product
>     appearance is important and therefore I want to give something they are
>     already used to.
> 
> 
> I definitely understand wanting users to be able to run the application
> in a way that feels as normal as possible to them: I think about this
> even with internal tools that I develop for collaborators with a limited
> technical background.
> 
> I think the approach I brought up would be compatible with making
> distributions. What I had in mind (but see below) was something like:
> ;; x86_64-no-contracts.rkt
> #lang racket/base
> (require "../run-program.rkt")
> (run-program #:arch 'x86_64 #:contracts? #f)
> 
> ;; i386-with-contracts.rkt
> #lang racket/base
> (require "../run-program.rkt")
> (run-program #:arch 'i386 #:contracts? #t)
>  Then `raco exe`/`create-embedding-executable`/whatever can work on
> either version.
> 
>     > Most seriously, depending on exactly how you use these compile-time
>     > environment variables, it seems like you could negate some of the
>     > benefits of the "separate compilation guarantee," particularly if you
>     > are assuming that all of your modules are compiled at the same time.
>     >
> 
>     Why would that be a problem?
> 
> 
> This is a longer discussion and I am by no means an expert, but I can
> point you to the "separate compilation guarantee" docs
> (http://docs.racket-lang.org/reference/eval-model.html#(part._separate-compilation))
> and Matthew Flatt's paper "Composable and Compilable Macros: You Want it
> /When?/" (https://www.cs.utah.edu/plt/publications/macromod.pdf).
> 

I took the time to think through this and skim through the paper. I
understand what you mean by separate compilation guarantee. I am not
convinced what I am doing breaks the separate compilation guarantee.
From what I understood, the guarantee is broken if the compilation
process has side effects (printing, writing to files, etc.), and the
approach I take has none. However, I might be missing some subtle
detail. I haven't yet internalised all of the information with regard
to phase levels.

> I don't think what you are doing circumvents the separate compilation
> guarantee per se, 

+1 ah, should have read this before writing the above. :)

> because you don't produce "external effects" (e.g.
> I/O) during compilation and then rely on those side-effects during
> run-time. But, while I have not thought especially deeply about this,
> using environment variables this way seems to be sort of the mirror
> image: the state of the universe external to the Racket runtime system
> has effects on the compilation of your modules, and it seems like that
> might introduce similar problems.
> 

This might be one of those subtleties I don't quite understand, and such
problems might well happen. As I said in the OP, I am actually still
having problems that I can't reproduce in a small example and that only
occur when the code is embedded in the executable.

> In particular, "the practical consequence of [the separate compilation]
> guarantee is that because effects are never visible, no module can
> detect whether a module it requires is already compiled. Thus, it can
> never change the compilation of one module to have already compiled a
> different module." This has all kinds of nice benefits that are detailed
> in the paper.
> 

This might be related to what I am seeing... which looks like this:
instantiate-linklet: mismatch;
 reference to a variable that is uninitialized;
 possibly, bytecode file needs re-compile because dependencies changed
  name: g8759.1
  exporting instance: '#%embedded:g28621:stochastic-statistics
  importing instance: '#%embedded:g26566:stochastic


On the other hand, it's strange that this only happens when the code is
embedded in the executable... so I am not sure if it's actually this or
just a bug in my phase 1 code in the embedding process.

> In your case you seem to assume that all of the modules are compiled at
> the same time (i.e. with the same set of environment variables). 

Correct.

> You do
> seem to put in the effort to actually uphold that assumption, and maybe
> there are compelling reasons to do so in your case, but I would suggest
> thinking carefully about that decision.
>  

I understand it's not the smartest thing to do for all sorts of reasons
and during testing I have been bitten more than once. So, I might take
the time to change it.

> 
>     That is a possibility but I run into the problem of having too many
>     modules as the number of possible configurations can easily explode as I
>     add more configurations and backends to my application. Getting one file
>     per compile-time configuration is essentially not workable.
> 
>     My CI script currently creates 48 configurations for testing. That's 4
>     boolean options plus 3 backend architectures that I either completely
>     support or are under development. If I get a couple of customers wanting
>     different architectures, I can easily go to maybe 5 or 6 backends and
>     have more than 100 configurations to test. I would probably have to stop
>     testing _every_ config at some point and choose which configs are more
>     important but this seems to be the problem with this kind of choice - in
>     GCC world we have the same problem and ended up having to create several
>     tiers of backends/configurations to be able to do a proper job at
>     testing and releasing.
> 
>     If I am missing some fundamental way Racket can help with this, I am
>     open to other options but it all boils down to moving as many things to
>     compile time config so I can drop unnecessary code and try to make this
>     as fast as possible.
> 
> 
> Yes, I see the issue here. I don't have a fully worked-out solution and
> probably can't without knowing your requirements in detail, but I have a
> few general thoughts.
> 
> How much of this configuration really needs to happen at compile-time
> vs. runtime? In the examples you've sent, you effectively turn
> "S10_ENABLE_CONTRACTS" and "ARCH" into runtime constants. 

With S10_ENABLE_CONTRACTS it's more than that, since the contract code is
actually removed and I save a lot of time by not using contracts. I
know contracts are good, but when you are competing with other platforms
written in highly optimized C++ code, every cycle counts.
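
To make that concrete, here is a minimal sketch of this kind of
compile-time switch. It is not the actual s10 code: the macro name
`define/checked` and the `scale` function are hypothetical, and only the
environment variable name comes from this thread. The macro reads the
variable while the module is being expanded, so the contract is either
compiled in or absent from the resulting bytecode.

```racket
#lang racket/base
;; Sketch only: `define/checked` and `scale` are hypothetical names.
(require (for-syntax racket/base)
         racket/contract)

;; Expands to `define/contract` when S10_ENABLE_CONTRACTS is set in the
;; environment at *compile time*, and to a plain `define` otherwise.
(define-syntax (define/checked stx)
  (syntax-case stx ()
    [(_ (name arg ...) ctc body ...)
     (if (getenv "S10_ENABLE_CONTRACTS")
         #'(define/contract (name arg ...) ctc body ...)
         #'(define (name arg ...) body ...))]))

;; The contract expression is dropped entirely in the no-contracts build.
(define/checked (scale n factor)
  (-> exact-integer? exact-integer? exact-integer?)
  (* n factor))
```

The catch, as discussed above, is that the expansion depends on the
environment at the moment each module is compiled, so all modules have
to be compiled under the same settings.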

> If it's just a
> matter that you don't want to ship extra code to clients, assuming you
> are already using `dynamic-require` or similar, I think
> `create-embedding-executable` or some part of that pipeline would let
> you omit unwanted collections/modules.

Yes, ARCH allows me to choose a single backend to compile in, instead of
compiling them all. I do that by dynamic-requiring the backend named by
ARCH.
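
Roughly, the selection looks like the sketch below. This is an
illustration under assumptions, not my real code: `backend-table` and
`load-backend` are hypothetical names, and standard-library modules
stand in for the real backend modules so that the example runs as-is.

```racket
#lang racket/base
;; Sketch only: the table and `load-backend` are hypothetical. Each entry
;; maps an architecture name to (module-path . exported-name); racket/list
;; stands in for a real backend module here.
(define backend-table
  (hash "x86_64" '(racket/list . first)
        "i386"   '(racket/list . last)))

;; Look up the backend for `arch` and load only that module.
(define (load-backend arch)
  (define entry
    (or (hash-ref backend-table arch #f)
        (error 'load-backend "unknown architecture: ~a" arch)))
  (dynamic-require (car entry) (cdr entry)))

;; Typical use, with the architecture taken from the environment:
;;   (define backend (load-backend (or (getenv "ARCH") "x86_64")))
```

A module that is only ever named in a `dynamic-require` is not a static
dependency, so the embedding step has to be told about it explicitly.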

> Of course, you would have to test
> that you really do include in each distribution all of the modules that
> will be needed at runtime, but you have to do that anyway:
> `create-embedding-executable` won't save you from `(dynamic-require
> 'something-that-doesn't-exist)`.
> 
> For the large number of configurations, I could see making a little DSL
> that generates submodules, rather than putting each configuration in its
> own file.
> 

I wonder what you have in mind here; it's not clear to me how it
would work.

> If you haven't already, you may want to look into Racket's signatures
> and units, which explicitly support separate compilation and linking at
> runtime. They seem to be a bit out of fashion because (as I understand
> it) they predate the module system and were formerly used, painfully, in
> places where `module` was really what was needed, but they are still a
> useful abstraction under the right circumstances. 

I actually started this work using Units and Signatures and then
ditched them after I found that dynamic-require is enough... :) Since then,
I have found a number of gotchas in using dynamic-require that I didn't
expect.

> For Digital Ricoeur, I
> use them in at least two places. I use them to organize the backends for
> our search feature, including uniformly adding a lazy-initialization
> option. I also use them for testing the system that sends
> notification and reminder emails about requests to register for our
> website. This would ordinarily be a pain to test, because it has side
> effects (sends emails), consults a database, and waits for, say, 24
> hours. With units, I can swap in alternate definitions of things like
> `alarm-evt` or our database-access functions for testing.
> 

That's a very interesting application of units. Is there a reason why
dynamic-require wouldn't work?

> Even if you don't use units directly, I have found the design useful as
> inspiration for a custom, one-off linking system (part of a DSL) that
> includes things like generating macro definitions and static checking
> specific to our domain requirements. We recently released this code as
> open-source: I can send you links if you're interested.
> 

I would definitely be interested in looking into it if you could send me
a direct link.

Many thanks for the reply, massive food for thought.

-- 
Paulo Matos
