import (long)

Felix Winkelmann Tue, 01 Jul 2014 15:51:06 -0700

Ok, let's start from scratch... 

* We can't change the existing machinery without breaking an awful lot
  of code, so any solution must be an addition to what we currently
  have.


* The basic entities we have to deal with are "compilation units",
  bodies of code, either statically linked into an executable (or
  library) or dynamically loaded.

* These compilation units may or may not contain one or more
  "modules", which are separate namespaces (or "bindings") over those
  bodies of code.

* "import" incorporates bindings into the current environment, either
  globally or inside another namespace (module). We want to
  _automatically_ make the code associated with that namespace
  available, regardless of the nature of the compilation unit that
  contains that code. Is this interpretation correct?

* Making the code inside a compilation unit available happens either
  by loading, which at some point results in a call to "load" (this
  includes interpreted code in source form, which is just another
  flavor of compilation unit), or by declaring an externally
  available entry-point, currently via "(declare (uses ...))".

  (This needs a more obvious or natural syntax at some point, but
  that isn't relevant right now.)

* Declaring an entry-point into the current compilation unit
  (basically the current source file) takes place by "(declare (unit
  ...))".

* The last two points are important if we want to support static
  linking. Loading is in this case the simpler operation, as the
  entry-point always has the same name. For static linking the
  entry-points need to be named differently (there might be ways
  around this limitation, but to keep things simple, let's not
  consider that right now.)

* So, if we create a "registry" of linked/loaded compilation units,
  "import" can consult this registry and check whether a compilation
  unit of the same name is already registered and, if not, default to
  loading a ".so" or ".scm" with the same name. If the latter is not
  found, we have an error. If it is found, add it to the registry.

* "import" incorporates bindings from a set of available modules, also
  registered somewhere, specifically in ##sys#module-table. Should it
  also handle compilation-units for which no bindings exist (i.e. all
  bindings are unqualified)? This is only useful at toplevel, or, in
  other words, not inside a module. This will also bring up the
  question whether such a behaviour might lead to head-scratching in
  case a module should exist, but the binding-information is
  unavailable for some erroneous reason.

* Declaring an externally available entry-point must add the
  compilation unit associated with it to the registry.
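
To make the registry idea from the points above a bit more concrete,
here is a minimal sketch in plain Scheme. Every name in it
(unit-registry, register-unit!, import-unit!) is made up for
illustration and does not exist in CHICKEN. The file lookup and the
loader are passed in as procedures, so the sketch runs without
touching the file system, and it says nothing about how bindings are
incorporated afterwards.

```scheme
;; Hypothetical sketch only: none of these names exist in CHICKEN.

;; The registry is an association list mapping unit names (symbols)
;; to a note on where the code came from.
(define unit-registry '())

(define (unit-registered? name)
  (and (assq name unit-registry) #t))

;; Called when an externally available entry-point is declared, or
;; after a unit has been loaded.
(define (register-unit! name origin)
  (if (not (unit-registered? name))
      (set! unit-registry (cons (cons name origin) unit-registry))))

;; What "import" would do before incorporating bindings: consult the
;; registry, else fall back to NAME.so or NAME.scm.
;; FIND-FILE takes a file name and returns it if it exists, else #f.
;; LOAD-FILE stands in for "load".
(define (import-unit! name find-file load-file)
  (cond ((unit-registered? name) 'already-registered)
        (else
         (let* ((base (symbol->string name))
                (file (or (find-file (string-append base ".so"))
                          (find-file (string-append base ".scm")))))
           (cond (file
                  (load-file file)
                  (register-unit! name file)
                  'loaded)
                 (else
                  (error "cannot import, no such compilation unit"
                         name)))))))

;; Usage: a unit registered by a declaration is not loaded again.
(register-unit! 'extras 'static)
(import-unit! 'extras (lambda (f) #f) (lambda (f) #f))
;; => already-registered
```

Passing the lookup in as a procedure is only a convenience for the
sketch; where the ".so" or ".scm" is actually searched for is a
separate question.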

(Sorry, now it gets complicated...)

* libchicken contains a number of entry-points, one for each library
  unit that comes with the core system. The registry must already have
  entries for these. Users might want to use a similar physical
  structure for their code, so we will have to provide means to add
  "default" registry entries, I think (I'm not completely sure right
  now - the resolution of the entry-points happens automatically by
  the linker, but we have to make later "import"s aware of this.)

* Currently "(declare (unit ...))" calls the entry-point,
  _initializing_ the compilation unit. Later "import"s will just
  incorporate the bindings. Do we want to initialize the compilation
  unit on the first "import"? If yes, we need to separate the notions
  of declaring an externally available entry-point and calling it, the
  latter being done (we hope) transparently by "import".

* The same situation arises with loaded compilation units. Consider a
  dynamically loaded ".so" that holds several compilation units: When
  is the entry-point of each contained compilation unit called? On
  first "import"? In this case it makes sense to generalize this, I
  think.

* The different actions or declarations will need different constructs
  to implement the low-level behaviour. Not all of them need to be
  user-visible. "import" naturally will be. Declaring the current
  compilation unit to have a separately named entry-point will be as
  well, and so will declaring an externally available entry-point. And
  finally we need something for registering a "default" (admittedly
  for those special occasions...)

* The registry needs to be something more extensible than a simple
  "feature" list. We have to keep track of what is initialized, and so
  on. Using any existing mechanism will only make it harder to later
  remove the old code and make the existing code even more complicated
  than it already is.

* Changing the semantics of "import" for "late" initializing of
  compilation units breaks backwards compatibility, but we don't want
  to create yet another special form, right? The conservative solution
  is to do initialization at the point where an externally available
  entry-point is declared or code is explicitly loaded, as is
  currently implemented.

  (Side note: loading invokes the default entry point "C_toplevel",
  declaring an externally available entry-point invokes the
  entry-point derived from the name of the compilation unit. In the
  case of an ".so" holding several compilation units, we have a
  mixture of default entry-point + separately named entry-points.  Oh,
  this is fun...)

* Thinking about this now, I realize that the compilation unit itself
  might already contain the binding-information - this is the case
  when we compile a module without emitting an import library. So late
  initialization actually doesn't work, unless we want to require
  import libraries in any case. A valid approach, but this may again
  have other implications.

* It would be nice to have some terminology for those "bodies of code"
  that we can use to invent new special forms to cleanly perform the
  above-mentioned "actions". This will of course increase the
  confusion in the beginning, but we can deprecate the old forms at
  some point.
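
As a rough illustration of the "late" initialization discussed above,
and of why the registry has to record more than a feature list does,
here is a minimal sketch in plain Scheme. It separates declaring an
entry-point from calling it, and calls it on the first "import" only.
All names are hypothetical, the entry-point is modelled as an ordinary
thunk rather than a C function, and the sketch ignores the case where
the binding-information lives inside the compilation unit itself.

```scheme
;; Hypothetical sketch only: none of these names exist in CHICKEN.

;; Each registry entry is a vector #(name entry-point initialized?),
;; so the flag can be updated in place.
(define late-registry '())

;; Declaring an externally available entry-point records it, but does
;; not call it.
(define (declare-entry-point! name thunk)
  (set! late-registry
        (cons (vector name thunk #f) late-registry)))

(define (lookup-entry name)
  (let loop ((entries late-registry))
    (cond ((null? entries) #f)
          ((eq? name (vector-ref (car entries) 0)) (car entries))
          (else (loop (cdr entries))))))

;; What "import" would do: run the entry-point the first time the unit
;; is imported, and skip it on later imports.
(define (import-late! name)
  (let ((entry (lookup-entry name)))
    (cond ((not entry)
           (error "unknown compilation unit" name))
          ((vector-ref entry 2) 'already-initialized)
          (else
           ((vector-ref entry 1))
           (vector-set! entry 2 #t)
           'initialized))))

;; Usage: the entry-point runs once, however often the unit is
;; imported.
(define init-count 0)
(declare-entry-point! 'extras
                      (lambda () (set! init-count (+ init-count 1))))
(import-late! 'extras)   ; => initialized
(import-late! 'extras)   ; => already-initialized
```

The "initialized?" flag is the kind of per-unit state that a plain
list of feature-IDs has no place for.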

I'm sure I have forgotten something, but it is important that we think
of all possible use cases before anything is changed, or we really
start going into details.

Note that our current CHICKEN does even more than this:
"require-extension" handling feature-IDs, for example. Or
automatically loading syntax-extensions. It's not a coincidence that
handling extensions/using/importing is in part done by a procedure
called "##sys#do-the-right-thing". And then there is figuring out
where the extensions are located, or telling the compiler what units
are loaded, or handling the "(srfi N ...)" extension-specifier even in
the presence of module-binding modifiers like "rename". Wheels within
wheels - it's terrible...

All that nasty low-level stuff does not necessarily have to be touched,
but care must be taken before we lock down what will be allowed in
the future and what will not. This is kind of obvious, but I just
wanted to mention it once more.

I hope I haven't raised the confusion to unbearable levels. My
intention was to clear things up, but I have my doubts whether this
was successful.


felix


[Chicken-hackers] simplifying loading/linking/import (long)