I used to think that we'd take advantage of the package manager by gradually pulling parts out of the Racket git repo and making them packages.
Now, I think we should just shift directly to a small-ish Racket core, making everything else a package immediately. "Core" means enough to run `raco pkg'.

A key point to remember is that "package" does not mean "omitted from the distribution". Instead, we'll construct a distribution by combining the core with a selected set of packages. Initially the selected set of packages will cover everything in the current distribution.

Jay and I have been lining up the pieces for this change (it's difficult to make a meaningful proposal without trying a lot of the work, first), and I provide a sketch of the overall plan below.

This plan has two prominent implications:

 * The current git repo's directory structure will change. Anyone who currently works with the Racket repo will need to adapt to the new directory structure (and probably git submodules in the future). All of the code currently in the Racket git repo will stay there (for now), but using it will involve at least one new step: linking packages within the repo into the core build --- probably by running some setup script.

 * The main Racket distributions at http://racket-lang.org/download/ will omit sources, including ".rkt" files, ".scrbl" files, and tests. Sources will remain readily available through the git repo and through the package manager, but getting users to try a source-code change will be less convenient than now. See "Binary Builds" below.

Repository Reorganization
-------------------------

To convert the current monolith into a core plus packages, we propose to reorganize the Racket git repository by

 1. pushing the current content into a "core" subdirectory, and

 2. lifting pieces back out of "core" and into new subdirectories, one for each package.

The resulting repo will have top-level directories with names like "core", "scribble", "gui", "slideshow", "drracket", and so on. Each directory other than "core" corresponds to a package.
We'll have to try this out to discover how finely we can break up the existing tree into packages. At worst, the "mr", "dr", and "plt" layers of "dist-specs.rkt" should work, but I think we'll be able to do better than that.

Eventually, when the dust settles, I think we'll want to convert every directory to its own git repo, and then we can incorporate the individual repos as git submodules.

Rearranging the repo will obviously break the current build system. Jay and I are creating a new build system, so the current nightly build and distribution processes do not need to adapt (although we're using many existing pieces). The new build system might be ready by the end of the week (and the repo reorganization will wait until the build system is ready).

Binary Builds
-------------

The proposed switch to binary distributions --- instead of always including source alongside generated bytecode and documentation --- is aimed at reducing dependencies between packages. Support for binary packages is also aimed at supporting faster installs.

In terms of dependencies, documentation for a library usually has more dependencies than the library itself. We don't want to limit the *documentation* for package X to avoid using or referring to package Y libraries in order to avoid a run-time dependency of X on Y. For that matter, we don't want to avoid documenting X in order to avoid a dependency on Scribble. A library's tests similarly could have dependencies that are not needed for the library itself.

We've adjusted `raco setup' and `raco pkg' to work with collections and packages that are in binary form. "Binary" is not a specific attribute of a package; it's just a package that happens to have ".zo" files without corresponding ".rkt" files, documentation without ".scrbl" sources, and so on. The intent is not for programmers to create binary packages, but to enable an automatic conversion of a source package to a binary package.
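
To make the notion of "binary" form concrete, here is a rough Python sketch of the source-to-binary conversion over a list of file paths. It is only an illustration under stated assumptions, not the actual `raco pkg' logic: the function names are hypothetical, bytecode for "x.rkt" is assumed to live at "compiled/x_rkt.zo" next to it, ".scrbl" sources are assumed to be dropped unconditionally, and "info.rkt" is assumed to always be kept.

```python
from pathlib import PurePosixPath


def bytecode_path(source: PurePosixPath) -> PurePosixPath:
    """Assumed bytecode location: compiled/<stem>_<ext>.zo beside the source."""
    name = source.stem + "_" + source.suffix.lstrip(".") + ".zo"
    return source.parent / "compiled" / name


def strip_to_binary(files):
    """Return the sorted paths that would remain in "binary" form.

    Hypothetical rules, for illustration only:
      - "info.rkt" is always kept (it carries the package metadata);
      - a ".rkt" file is dropped only if its bytecode is also present;
      - ".scrbl" documentation sources are dropped.
    """
    paths = {PurePosixPath(f) for f in files}
    kept = []
    for p in paths:
        if p.name == "info.rkt":
            kept.append(str(p))
            continue
        if p.suffix == ".scrbl":
            continue  # documentation source: dropped in this sketch
        if p.suffix == ".rkt" and bytecode_path(p) in paths:
            continue  # source with matching bytecode: dropped
        kept.append(str(p))
    return sorted(kept)


example = [
    "mylib/main.rkt",
    "mylib/compiled/main_rkt.zo",
    "mylib/info.rkt",
    "mylib/mylib.scrbl",
    "mylib/notes.rkt",  # never compiled, so it stays
]
print(strip_to_binary(example))
```

In this toy example, "main.rkt" disappears because its ".zo" is present, while "notes.rkt" stays because nothing would replace it. The point is that "binary" is a property of the files that happen to be present, not a declared attribute of the package.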
We can then set up different catalog servers to serve source and binary versions of packages. Finally, we'll be able to quickly create distributions --- either the standard one or others --- by combining a core build with a set of binary packages.

Some drawbacks to omitting source are immediately apparent:

 - Users will be less able to make source changes on their systems to help us debug.

   Having the binary form of a package installed does not preclude "upgrading" to a source package. So, we could ask a user to use the package manager to install the source form of, say, the "drracket" package, and then try out a change. In that way, users can still help, but it will be less convenient.

 - Users will be less able to read installed code as examples.

   Our source code is now easily available via the web interfaces at http://git.racket-lang.org/ and GitHub, so users can always look there, instead.

It would be possible, of course, to support distributions and packages that include both source and compiled forms (like our current distribution), but that arrangement requires even more work. We'd like to try out the simpler source vs. binary options, first.

More Detail
-----------

Here's our plan for a new repo and build process:

 * There's a Racket core, which will look a lot like this:

     https://github.com/mflatt/min-racket

   The core is intended to provide everything that is needed to run `raco pkg', which is the way to install anything else. The repo above is not yet minimal in that sense, but I think it's close.

   After `make install', a simple tool (probably a new mode for `setup/unixstyle-install') can copy a built tree to distribution form. The copy strips sources for which bytecode files exist. The core has no documentation or even documentation sources, and dropping sources for a distribution includes dropping "tests" subdirectories.

   In the distribution copy, the default package catalog is switched to a binary package server, instead of a source package server.
   (Developers can continue to work with an in-place build in package-source mode, as usual.)

 * For each package to be included in the distribution, a build machine installs and strips each package to binary form, where "binary" form means that sources are removed while bytecode is kept, etc.

   A package's `deps' in "info.rkt" should describe all run-time dependencies. Additional build-time dependencies must be specified by `build-deps', so the complete set of dependencies when building a package from source is `deps' plus `build-deps'. (In the long run, we can add machinery to check these dependency declarations.)

   Stripping to binary form adjusts "info.rkt": the `build-deps' entry is dropped, and the `scribblings' entry is rewritten to `rendered-scribblings' to install pre-built documentation, and so on. Fields in "info.rkt" can fine-tune the stripping process for a package or collection.

   A package must not depend on anything in another package that is stripped away for binary form.

   Rendered documentation for a binary package redirects to a server for any link that goes outside the document. Every package's documentation therefore makes sense by itself, so it is easy to include rendered documentation in a package; meanwhile, the documentation server will be populated with built packages. When a package is installed with its documentation, links are redirected to local copies for any documentation that is locally installed.

 * On the N platforms for which we want to provide distributions, we build the core, install binary packages (i.e., the ones to be included in the distribution), and then convert to an installer.

   There should be no need for "dist-specs". The distinction between source and binary is mostly implicit and otherwise specified per-package as part of package stripping. A binary installer is packed directly from a binary installation.

   The build and installer-creation scripts will themselves be packages, so anyone can run them.
   Also, we hope to eventually offer a service that takes a set of packages and produces a set of installers that are preconfigured with the given packages.

   As a minor point, platform-specific packages must be created in binary form in the first place (i.e., there is no source form), since the creation of binary packages from source form will happen only on one platform. For example, the current `make'-time download of GUI libraries on Windows and Mac OS X will turn into platform-specific package dependencies, and the packages are straightforwardly created as binary in the first place.

 * There's just one core source distribution --- not different ones for Unix, Windows, and Mac --- that is derived directly from the git repo.

We envision no distribution that includes both source and compiled bytecode/documentation, which is the form that our current distributions take. We could conceivably support such distributions one day, either by building from source on N machines or having a third kind of package that includes both source and compiled parts, but this is a good place to simplify at first.

Some pieces yet to be implemented:

 - Stripping a core build to prepare for binary package installs (looks easy).

 - Submodule stripping when converting a package to binary form (looks easy).

 - Scripts and servers to drive (1) the core and package build once, and (2) the core builds, package installs, and installer bundles on various platforms.

Some complexities of the current build/bundle process that go away:

 - No "dist-specs"; no mz/mr/dr/plt spec.

 - No "info-domain" fixup when packaging a distribution.

 - No extracting of binaries from one platform and splicing them into a generic build shell (or construction of the generic build shell).

 - No "src" distribution variants.

Some complexities that stick around:

 - DESTDIR mangling and `setup/unixstyle-install' shuffling.

 - Some process for taking a pile of installers and putting them on a web page.

 - DrDr builds and tests from source, including ring-0 packages.

_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev