On Wednesday, 5 June 2013 at 18:21:04 UTC, H. S. Teoh wrote:
On Wed, Jun 05, 2013 at 01:20:48PM -0400, Jonathan M Davis wrote:
On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
> We have a standard library in disagreement with the language's
> encapsulation mechanics. The module/package system in D is almost
> ignored in Phobos (and that's probably why the package system still
> has all these little things needing ironing out). It seems to owe
> influence to typical C and C++ library structure, which is simply
> suboptimal in D's module system.
I honestly don't see how Phobos is in disagreement with the module
system. No, it doesn't use hierarchy as much as it should, and there
are a few modules that are overly large (like std.algorithm or
std.datetime), but for the most part, I don't see any problem with its
level of encapsulation. It's mainly just its organization which could
have been better. My primary objection here is that it seems
ridiculous to me to create lots of tiny modules. I hate how Java does
that sort of thing, but there you're _forced_ to in many cases,
whereas we have the opportunity to actually group things together in a
single module where appropriate. And having whole modules with only
one or two functions is way too small IMHO, and that seems to be what
we're proposing here.
[...]
As Andrei pointed out, I think we need to look at this not from a size
perspective (number of lines, number of functions, etc.), but from an
API perspective: do these functions/structs belong together, or are
they only marginally related? More precisely, if some user code uses
function X, is that code equally likely to also use Y? Are there
common use cases in which only Y is used, not X?
If the use of function X almost always implies the use of function Y
(and vice versa), then they belong in the same module. Otherwise, I'd
say they are candidates for splitting up.
If function X uses function Z, and function Y also uses function Z,
but the use of X does not necessarily imply the use of Y (and vice
versa), then I'd argue that X, Y, and Z should be in separate modules
to maximize reuse and reduce the amount of code you have to pull in
(you shouldn't be forced to pull in Y just because you use X, which
calls Z, which Y happens to also call).
This may be a bit heavy-handed for user code, but for Phobos, the
standard library, I think the bar should be set higher. After all, one
of the stated goals of Phobos is that you shouldn't need to pull in a
whole ton of code just because you call a single function. Right now I
think we're a bit short of that goal.
Massive +1
Modules are for grouping functions/types that are commonly used
together or have interdependencies, not for grouping things that
are in a similar category (although these things can be related).
I don't care if levenshteinDistance is a "classic algorithm", I
don't want to have to compile it every time I want to take the
minimum of two numbers. Barely anyone is ever going to use it, so
it should be off in a module on its own.
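To make the complaint concrete, here is a minimal sketch of the "I only want min" case. The helper name cappedAt is made up for illustration; min and std.algorithm are the real Phobos names. As far as I understand it, the selective import only limits which names become visible in my code; the compiler still has to read the whole module that min lives in, so the import syntax alone doesn't buy the granularity I'm asking for.

```d
// Selective import: only `min` is brought into scope from std.algorithm.
// Assumption: this narrows name visibility, not the amount of source the
// compiler has to process for the imported module.
import std.algorithm : min;

// Hypothetical helper, not part of Phobos.
int cappedAt(int value, int limit)
{
    // Return `value`, but never more than `limit`.
    return min(value, limit);
}

void main()
{
    assert(cappedAt(10, 7) == 7);
    assert(cappedAt(3, 7) == 3);
}
```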
There's absolutely nothing wrong with having lots of small
modules provided that you don't end up importing the same sets of
modules over and over. There are numerous advantages:
1. Makes it easier to manage dependencies.
1a. reduces compile times.
1b. reduces binary size.
1c. benefits incremental and distributed/parallel compilation.
2. Makes version control easier, as more files mean merge
conflicts are less likely.
3. Makes it easier to navigate files.
The only downside is that you may occasionally have to import
more modules.
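One way to keep that downside manageable, sketched here under my own assumptions, is to import each small dependency inside the function that needs it rather than piling everything up at the top of the file. The function names parseCount and editDistance are hypothetical; the std.* modules and symbols are the real Phobos ones.

```d
// Hypothetical helper: needs only `to` and `strip`.
int parseCount(string text)
{
    // Scoped imports: visible only inside this function.
    import std.conv : to;
    import std.string : strip;
    return to!int(strip(text));
}

// Hypothetical helper: the only place that needs levenshteinDistance.
size_t editDistance(string a, string b)
{
    import std.algorithm : levenshteinDistance;
    return levenshteinDistance(a, b);
}

void main()
{
    assert(parseCount(" 42 ") == 42);
    assert(editDistance("kitten", "sitting") == 3);
}
```

Each function then documents its own dependencies, so having more modules to import doesn't turn into one long import list that every reader has to wade through.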