Geoffrey Broadwell wrote:

This is a VERY ROUGH OUTLINE of my proposal for the design of the Parrot
module ecosystem.  I'm looking for any and all comments, and I expect
many changes before we reach even rough consensus.  Please feel free to
circulate this to people who understand the problem space better than
I do!  I can use all the input I can get.

Note that there are some items explicitly marked 'undecided'.  I thought
about these enough to realize I just don't know enough yet to make a
decent recommendation, so input on these is especially appreciated.

A great proposal, thanks for taking the time to think through the issues!

************************************************************************
VERSION: 2

Requirements
------------

* General
  * Distributions should be dead easy for module authors to create,
    and for users to install.
  * We can create a centralized metadata store, but do not want to
    build and manage a module distribution network ...

Good.

  * However it should be possible for another group to do so.

<shrug> I'm not sure it is possible. Keep in mind that we're not just talking about custom Parrot modules; we're talking about the entire collected history of Perl, Python, PHP, Ruby, Lua, etc. modules. That's an insane amount of data. Even Google isn't crazy enough to want to take that on.


And, an additional requirement:
* The API for extracting info from the metadata store, and the process of installing modules based on that metadata, should be dead simple and clearly specified so people can clone it. If we see implementations popping up in Python, PHP, Ruby, Perl (5 & 6), etc., it'll be a mark of success.


* Toolchain
  * Basic tools can assume Parrot and the core modules are working,
    but require no other dependencies internally.
  * All external tools needed to download/build/install modules will
    be specified in the module metadata.
  * Tools should be easy to configure.
  * Tools should attempt to auto-configure as much as possible.

Good.

  * Tools must properly handle the difference between user-local,
    site-local, and vendor-installed modules.

Not sure this really matters.

* Metadata
  * Simple, extensible format.
  * Unicode and case-retaining.
  * Must include its own spec version.

Good.

  * Sufficient for automated programs to create system packages
    (DEB, RPM, etc.).

That would be tough, but we can at least cover the simple cases, with guidance to point people toward how to extend the generated template to handle the harder cases.

  * Separate static v. configure-discovered v. hand-edited metadata.
    Separate files?

Sounds overly complex. Provide a field in the metadata for "data_source".

  * Includes fetch, configure, build, test, install, and runtime
    dependencies.
  * Should be able to track author, mailing list, bug email/bug URI,
    wiki, homepage, source repository, etc.

Good.

  * Allows disambiguation as per Perl 6 module spec (authorities,
    versions, authors, etc.).

It's just extra metadata, sure, why not. We should also make sure we can accommodate the metadata for RubyGems, PHP PEAR, etc.

  * Specifies rules for dependency string parsing/interpretation.

Not sure what you mean. More detail?

Proposal
--------

* Overview
  * Parrot community builds a module metadata search system.
  * This search system gathers metadata from various sources, and
    allows users to query it via web browser or API, but does not
    itself store the actual modules.
  * Once found, modules can be fetched from many possible sources,
    including VCS repositories, FTP mirrors, etc.
  * Parrot team will need to standardize module metadata, provide
    the libraries and tools necessary to use the search system,
    provide guidelines for extending the toolchain, and mentor the
    growth of the ecosystem until it stands on its own.

Good.

* Metadata format
  * Served metadata container is gzip'ed tarball (.tar.gz? .tgz?).
  * Core metadata is in META.json at top level of container.
  * Container includes copies of special files (e.g. README).
  * Format for specifying non-metadata-only build scripts undecided.
  * Integrity check / authentication methods undecided.
    * Probably at least md5sum and sha1sum for source tarballs,
      but what about when pulling from raw VCS repo?

You're still thinking CPAN (the best technology 1995 had to offer).

The primary interface should be a web form where people can enter metadata about their module. They should also be able to *update* the information stored there, to mark an older version as deprecated, that a module is no longer maintained, change the owner(s), change the URI for download, or to remove a module entirely. (Look at Launchpad.net for inspiration.)

From the form, we can generate a JSON dump of any module's metadata. We can also accept a JSON block as an alternate input source, so someone can keep a copy of the .json file checked into their repository, make a few changes and paste it in the web form when releasing the next version.

The metadata shouldn't keep copies of any files from the module distribution, though it should have space for a description. (If someone's lazy they might paste in the entire README, but that's generally about building a module, and so not appropriate for someone who's looking for general information about it, a.k.a. "Do I want this module?".)

* Core modules
  * parrot config  (already exists -- config.pir)
  * HTTP client    (at least GET, with redirect and proxy support)
  * zlib           (at least decompress)
  * tar            (at least extract)
  * JSON           (at least parse)
  * version spec   (at least parse and compare)
  * library probe  (shared library info: present? version? location?)
  * file paths     (portability: File::Spec + File::Basename + ...)
  * file install   (portability: copy file, set file perms, etc.)
  * query metadata (perform API calls to metadata/search server)
  * installer lib  (all the real brains/glue for the module repo client)
  * installer ui   (CLI and/or Readline, minimal brains, uses lib)

Too heavyweight. And honestly, I think it's backwards. Aviary should be a standalone "Pack" that has Parrot as its first dependency. (If you have Parrot installed, great; if not, it'll install it for you. It's pretty much what Rakudo does already.)

* Basic Batteries modules
  * Full versions of any modules that are limited in Core
  * Installer add-ons:   VCS fetch/use system pkgs/full depresolve/etc.
  * Module author tools: create/register/update/upload/etc.
  * PIR-level tools:     disassembler/debugger/profiler/data dumper
  * NCI tools:           parse header/manage typemap/wrap C struct/etc.
  * Standard interfaces: TAP, DBDI, logging, ?
  * Standard libraries:  OpenSSL, DateTime, temp dir/file, ?

* Possible Power Packs (NOTE: *EXAMPLES ONLY*, DON'T BIKESHED!)
  * Database:     DBDs (drivers), SQL clients, per-HLL DBI variants
  * Testing:      smoke/tinder/smolder clients, per-HLL Test::* variants
  * Security:     SSH, GPG, libpcap, ...
  * Unixen:       POSIX, Fcntl, Errno, ...
  * Markup:       YAML, libxml2, Expat, DOM, SAX, ...
  * VCS:          CVS, Subversion, git, Mercurial, ...
  * Email:        POP, IMAP, SMTP, MIME, ...
  * GUI:          Qt, GTK+, Wx, Tk, ...
  * 2D Graphics:  libpng, GD, SDL, Cairo, ...
  * 3D Graphics:  OpenGL, EGL, GLU, ...
  * Sound:        OpenAL, Pulse Audio, JACK, ...
  * Game Support: Require other Power Packs: Audio, 2D/3D Graphics

Details of what modules go where can come later. First step is to get Pack installs working. (Basic Batteries is just a small Pack.)

Aviary could include tools to make it very easy to create a Pack from a simple JSON file (a list of modules, and a title/description for the Pack). All Packs should be standalone, installing Parrot and Aviary if needed.
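To make the "Pack from a simple JSON file" idea concrete, here is a minimal sketch in Python. The key names ("title", "description", "modules"), the module list, and the install_plan helper are all assumptions for illustration; nothing here is a decided format. It only shows that a Pack can be a title, a description, and a list of modules, with Parrot and Aviary put first when they are not already installed.

```python
import json

# Hypothetical Pack definition; the key names are assumptions, not a spec.
PACK_JSON = """
{
  "title": "Basic Batteries",
  "description": "Hypothetical example Pack",
  "modules": ["TAP", "DateTime", "OpenSSL"]
}
"""

def install_plan(pack, installed):
    """Return names to install, in order: Parrot, Aviary, then the Pack's
    modules, skipping anything already installed and any duplicates."""
    wanted = ["Parrot", "Aviary"] + list(pack["modules"])
    plan = []
    for name in wanted:
        if name not in installed and name not in plan:
            plan.append(name)
    return plan

pack = json.loads(PACK_JSON)
# Parrot is already present here, so only Aviary and the modules are planned.
print(install_plan(pack, installed={"Parrot"}))
```

The same function with an empty "installed" set puts Parrot first, which is the standalone behaviour described above.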

* Misc recommendations
  * Separate 'parrot-modules' mailing list for module creators/users.

Nothing kills a good idea faster than shoving it off on a separate mailing list that no one reads. If module-specific traffic ever seems to be overwhelming parrot-dev we can split it off. (Since we got the ticket traffic off parrot-dev, traffic is quite tolerable now.)

  * Default to simple (CPAN-style) dependency resolution; upgrade to
    full resolution and system package awareness in Basic Batteries.

Not sure what you mean. More detail?

  * Names so far suggested for module repository network:
    + Aviary

Love it. Very Parroty. aviary.parrot.org?

    + CPAAN
    + FPAN

Both would lead people to expect CPAN, which would be limiting to us, and frustrating to them when they find out it's not.

Metadata Proposal
-----------------

* Required fields:
  * meta-spec
    + version
    + uri
  * name
  * authority
  * version
  * license
    + type
    + uri
  * copyright_holder
  * abstract

Good.
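As a rough illustration of how a tool might check these required fields, here is a small Python sketch. The field names come from the list above; the example record, its values, and the missing_required helper are made up for illustration and are not part of any agreed spec.

```python
import json

# Required fields from the list above; nested entries name required sub-fields.
REQUIRED = {
    "meta-spec": ["version", "uri"],
    "name": [],
    "authority": [],
    "version": [],
    "license": ["type", "uri"],
    "copyright_holder": [],
    "abstract": [],
}

def missing_required(meta):
    """Return a sorted list of required fields (dotted paths) absent from meta."""
    missing = []
    for field, subfields in REQUIRED.items():
        if field not in meta:
            missing.append(field)
            continue
        for sub in subfields:
            if not isinstance(meta[field], dict) or sub not in meta[field]:
                missing.append(f"{field}.{sub}")
    return sorted(missing)

# Hypothetical record; every value is invented for this example.
EXAMPLE = json.loads("""
{
  "meta-spec": {"version": "1", "uri": "http://example.org/meta-spec"},
  "name": "Example-Module",
  "authority": "example:someone",
  "version": "0.1.0",
  "license": {"type": "artistic_2"},
  "copyright_holder": "Someone",
  "abstract": "An illustrative module record"
}
""")

# The example omits license.uri on purpose, so that is what gets reported.
print(missing_required(EXAMPLE))
```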

* Manifest fields:
  * files
    + configure
    + build
    + test
    + install
      - share
      - docs
      - bin
      - lib
      - runtime

Skip the manifest, it's a pile of duplicated data that's only needed by the build process (by the time you start the build process, you have the tarball anyway).

* Dependency fields (as { [dep_name]: [version_spec], ... }):
  * provides
  * conflicts
  * requires
    + fetch
    + configure
    + build
    + test
    + install
    + runtime

Good.
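For illustration only, here is how the per-phase "requires" maps might be consumed, sketched in Python. Two things are assumptions on my part: that phases are cumulative in the order listed (so running tests needs the fetch, configure, build, and test dependencies), and that version specs can be treated as opaque strings until their syntax is decided. The module names and specs in the example are invented.

```python
# Phase order taken from the list above; treating it as cumulative is an assumption.
PHASES = ["fetch", "configure", "build", "test", "install", "runtime"]

def deps_through(requires, last_phase):
    """Collect {dep_name: version_spec} for every phase up to and including
    last_phase. Version specs are treated as opaque strings."""
    collected = {}
    for phase in PHASES[: PHASES.index(last_phase) + 1]:
        for dep, spec in requires.get(phase, {}).items():
            # Keep the first spec seen; reconciling differing specs is not defined here.
            collected.setdefault(dep, spec)
    return collected

# Hypothetical "requires" block; names and specs are made up.
requires = {
    "fetch": {"http-client": "1.0"},
    "build": {"make": "0"},
    "test": {"TAP": "0.5"},
    "runtime": {"zlib": "1.2"},
}

print(deps_through(requires, "test"))
```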

* Optional features fields:
  * optional_features
    + [feature_name]
      - description
      - [any/all dependency fields as needed]

Good.

* Other optional fields:
  * description
  * keywords
  * generated_by
  * contributors
    + authors
    + maintainers
    + translators
    + testers
    + reviewers
  * resources
    + source
    + homepage
    + bugtracker
    + wiki
    + repository
      - type
      - checkout_uri
      - browser_uri
      - project_uri
    + mailinglists
      - [list_name]
        . address
        . uri

Good.

* Undecided fields:
  * dynamic_config
  * no_index
  * digests
  * signatures

We should be prepared for the format to grow and change over time, possibly allowing custom fields.


I don't see anything here for "standard build instructions". As in, the specific command-line instructions for "configure", "build", "test", and "install". These could allow variable substitutions from parrot_config (or Aviary's collected configuration information), so Rakudo's "perl Configure.pl" could be "@perl@ Configure.pl", while Pynie's "python setup.py build" could be "@python@ setup.py build", and a general 'make test' could be "@make@ test" (to allow for nmake, etc).
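A minimal Python sketch of that substitution, to show how little machinery it needs. The expand_command helper, the config values, and the step names as dictionary keys are assumptions for illustration; a real tool would pull the values from parrot_config or Aviary's own configuration.

```python
import re

def expand_command(template, config):
    """Replace each @name@ placeholder with config[name].
    Raises KeyError if a placeholder has no configured value."""
    def lookup(match):
        key = match.group(1)
        if key not in config:
            raise KeyError(f"no configured value for @{key}@")
        return config[key]
    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)@", lookup, template)

# Hypothetical configuration values, standing in for parrot_config output.
config = {"perl": "/usr/bin/perl", "python": "/usr/bin/python", "make": "nmake"}

# Build instructions as they might appear in a module's metadata.
steps = {
    "configure": "@perl@ Configure.pl",
    "build": "@python@ setup.py build",
    "test": "@make@ test",
}

for step, template in steps.items():
    print(step, "->", expand_command(template, config))
```

Failing loudly on an unknown placeholder seems safer than passing "@foo@" through to the shell, but that is a design choice, not a requirement.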

Allison
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev
