On Fri, Dec 2, 2011 at 8:21 PM, Jeffrey Thalhammer
<j...@imaginative-software.com> wrote:
> Hi everyone-
>
> I need some suggestions for terminology to use in my code and documentation.  
> I'm picky about names, so this is important to me (perhaps more than it 
> should be).  The context is Pinto, which is yet-another suite of libraries 
> and tools for building a private CPAN-like repository.  Here's what I have so 
> far...

I would encourage you to use existing names/conventions whenever
possible.  Some references to consider, if you haven't already:

http://www.dagolden.com/index.php/308/packages-modules-and-distributions/
https://metacpan.org/release/URI-cpan
https://metacpan.org/module/CPAN::DistnameInfo (which you've seen)

> Distribution:  A Distribution is an abstract concept that defines 
> relationships between packages.  The minimal concrete implementation of a 
> Distribution would be just a META.json (or equivalent) file.  Distributions 
> also have names and versions like Foo-Bar-1.2

No.  At best a distribution is a collection of zero or more modules.
(It could be all scripts and no modules.)  META is not required.

> Distribution Archive:  A Distribution Archive is the physical manifestation 
> of a Distribution, and corresponds to an actual file on the local disk.  For 
> example, /home/jeff/Foo-Bar-1.2.tar.gz or C:\MyDocuments\Foo-Bar-1.2.tar.gz

I don't think you gain much by distinguishing this from distribution.
If you need to, I would consider "abstract distribution" for the
concept of a collection of modules and "distribution" for the archive
file as that is how it's commonly referred to elsewhere.

> Distribution Path:  The Distribution Path is how an Archive is identified in 
> a CPAN index.  It is basically a URL fragment that looks like 
> A/AU/AUTHOR/Foo-Bar-1.2.tar.gz.  This is the term I'm having the most trouble 
> with.  CPAN::DistnameInfo calls this the "prefix" but I don't really like 
> that either.

A distribution file can only be uniquely identified on CPAN by
AUTHOR/Foo-Bar-1.23.tar.gz.  This is why I think that separating the
concept of distribution from path is problematic.  If you define
distribution to be the tarball, then the name of the distribution is
AUTHOR/Foo-Bar... etc.  (The "A/AU" is unnecessary as it can be
computed.)
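
To illustrate "can be computed", here is a minimal Python sketch.  It
assumes the usual layout in which the two leading directories are the
first character and the first two characters of the author ID.  The
function name cpan_index_path is hypothetical, not part of any
existing tool.

```python
def cpan_index_path(distfile: str) -> str:
    """Expand 'AUTHOR/Foo-Bar-1.23.tar.gz' into 'A/AU/AUTHOR/Foo-Bar-1.23.tar.gz'.

    Assumption: the index path is built from the first one and first two
    characters of the author ID, followed by the author-relative path.
    """
    author, sep, filename = distfile.partition("/")
    if not sep or not author or not filename:
        raise ValueError(f"expected AUTHOR/filename, got {distfile!r}")
    # Only the author ID is needed to derive the "A/AU" prefix.
    return f"{author[0]}/{author[:2]}/{distfile}"


print(cpan_index_path("AUTHOR/Foo-Bar-1.23.tar.gz"))
```

Since the prefix carries no information beyond the author ID, storing
only AUTHOR/filename loses nothing.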

It's what rjbs and I called a cpan::distfile URI.

If you need, you might discuss "distribution name" as the bundle that
may span multiple releasing authors, but with the huge caveat that two
distribution files with the same "name" (sans author) need not have
anything to do with each other.

    DAGOLDEN/Foo-1.23.tar.gz
    THALJEF/Foo-1.23.tar.gz

These might be the same inside or they might be different.  My Foo
distribution could contain module ACME::Foo and yours might contain
Acme::FOO.  Those would be two totally legal, indexable distributions
as far as PAUSE is concerned.

Likewise "distvname" from DistnameInfo is a useful concept, but not
sufficient to identify uniqueness.
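
A rough Python sketch of that point, using the two files above.  The
DistFile type and parse_distfile helper are hypothetical, and the
suffix stripping is a deliberate simplification, not what
CPAN::DistnameInfo actually does.

```python
from typing import NamedTuple

# Assumed, non-exhaustive list of archive suffixes for this illustration.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


class DistFile(NamedTuple):
    """Identity of a distribution file: author ID plus file name."""
    author: str
    filename: str

    @property
    def distvname(self) -> str:
        # Name-and-version part: the file name minus a known archive suffix.
        for suffix in ARCHIVE_SUFFIXES:
            if self.filename.endswith(suffix):
                return self.filename[: -len(suffix)]
        return self.filename


def parse_distfile(path: str) -> DistFile:
    """Split 'AUTHOR/Foo-1.23.tar.gz' into author and file name."""
    author, sep, filename = path.partition("/")
    if not sep or not author or not filename:
        raise ValueError(f"expected AUTHOR/filename, got {path!r}")
    return DistFile(author, filename)


mine = parse_distfile("DAGOLDEN/Foo-1.23.tar.gz")
yours = parse_distfile("THALJEF/Foo-1.23.tar.gz")

# Same distvname, yet two distinct distribution files.
print(mine.distvname == yours.distvname, mine == yours)
```

Keying on the (author, filename) pair rather than on distvname keeps
the two files apart.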

My general conclusion is what I said in my blog post.  A
"distribution" is an archive file 'AUTHOR/Tarball-version.suffix' that
contains 0 or more modules.  That is the only definition that doesn't
get you into trouble with edge cases.

> Package:  A package is just a package, in the usual way.  That is, something 
> declared with the "package" keyword.
> Module: I actually avoid using the term Module because I think it is often 
> misused.

Agreed.  See my blog post about it.  A "Module" is something that you
can give to "use" or "require" and get a thing loaded.  It may contain
zero or more Packages (i.e. namespaces).

> Repository: A repository is a general term for any CPAN-like pile of files.  
> This includes CPAN mirrors, as well as any DarkPAN or mini-cpans.  A 
> repository has a URL that identifies the entry point.  For example: 
> http://cpan.perl.org

Fine, though you imply but don't state that a repository contains
distributions in a particular directory layout below the base URL.

-- David
