On 12/17/2016 02:34 PM, Chris Wright wrote:
Just looking at this again:
The obvious workaround to the problem that dependencies must be module-
level is to simply define many small modules---in the extreme, one per
declaration.
Andrei works in phobos a lot. Phobos has a lot of large modules. For
instance, std.datetime is 35,000 lines. It's not unusual for a phobos
module to have over 6,000 lines (std.math, std.typecons, std.traits,
std.format, std.conv).
Let's take a look at that hypothesis. The example I chose randomly (and
which turned out to be a rat's nest of fuzzy dependencies) was std/array.d,
clocking in at 3585 lines. Then looking at the entire project:
    wc -l std/*.d \
        std/{algorithm,container,digest,experimental,internal,net,range,regex}/**/*.d \
        | sort --key=1 -n | cat -n
This outputs the modules in the standard library (excluding those that
are simple header translations), sorted by LoC, numbered. See result in
http://paste.ofcode.org/Lc5xfcs8GqpT2cabApSSgk. That shows 137 modules,
median length 903, average length 2055 --- including full documentation,
unittests, and examples. These numbers seem quite reasonable and if
anything compare favorably against other projects I've been on.
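For anyone who wants to recompute the median and average rather than read them off the paste, here is a minimal sketch. The helper name loc_stats is made up for illustration and is not part of the command above. It assumes input in the "<count> <path>" shape that wc -l prints, and it skips the trailing "total" line that wc adds when given several files.

```shell
#!/bin/sh
# loc_stats (hypothetical helper): summarize `wc -l` output read from stdin.
# Prints the number of files plus the median and average line count.
# Assumption: every input line looks like "<count> <path>".
loc_stats() {
    # Drop wc's summary line, then sort numerically by line count.
    grep -v ' total$' | sort -n -k1,1 | awk '
        { n[NR] = $1; sum += $1 }
        END {
            if (NR == 0) { print "modules=0"; exit }
            # Odd count: middle value. Even count: mean of the two middle values.
            if (NR % 2) median = n[(NR + 1) / 2]
            else        median = (n[NR / 2] + n[NR / 2 + 1]) / 2
            # %d truncates any fractional part.
            printf "modules=%d median=%d average=%d\n", NR, median, sum / NR
        }'
}
```

With the function defined in the current shell, piping the same wc -l invocation into loc_stats instead of into sort and cat -n gives the three summary figures directly.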
I'd normally recommend breaking up modules at one fifth that size.
Yeah, std/datetime.d is a monster, from what I can tell owing to a rote
and redundant way of handling unittesting. I didn't look at its
dependencies, but I doubt they are special. I was quite vocal about
breaking it up, but I got mellower with time since (a) someone measured
its size without unittests and it was something like one order of
magnitude smaller, and (b) there was really no more trouble using or
maintaining it than with anything else in Phobos.
I should also add that each large project has a couple of outliers like
that. I even recall a switch of a couple thousand lines once :o).
The standard library benefits from low granularity modules. It needs to
implement a variety of related tools for working with particular things.
For the hunting-for-definitions case, you also need:
* a module with more than a few imports, from different libraries or
packages
* ambiguous names, or functions that are widely used
* the user can't use an IDE / ctags / dcd
* the user can't use ddox / dpldocs.info, which turns type references
into links; or the user is using that and needs to find the definition of
a template constraint
* the maintainer cannot use selective imports
* the maintainer cannot break the module up to reduce the number of
dependencies
* the maintainer is willing to spend the effort to convert top-level
imports into tightly scoped imports
For the compilation-speed case, you need:
* large dependencies that this allows you to skip (the module combines
several types of functionality with different dependencies)
* the imported module must be in another compilation unit (incremental
compilation or a separate library)
* the dependencies can't be used by any other module in the compilation
unit
* no selective imports
* the module being compiled depends on something in the same scope
That's a pretty marginal use case.
Most of these have been the case with all C++ and D projects I've been
involved with at Facebook.
Please let me know which parts of this information I should include in the
DIP to make it better. Thanks.
Andrei