Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread H. S. Teoh via Digitalmars-d
On Thu, Feb 22, 2018 at 03:53:59PM +0100, Timon Gehr via Digitalmars-d wrote:
> On 22.02.2018 08:13, Timothee Cour wrote:
> > Advantages of splitting:
> > ...
> > * easier to edit (no need to scroll much to see entirety of module
> > we're editing)
> 
> I don't think this particular point is true. Splitting can actually
> make editing slightly harder. Scrolling is an inefficient way to find
> the definition to edit; just search the file for the definition.

+1.

A worthwhile coding editor will have built-in functionality to search
for symbols and making bookmarks.

In Vim, I usually mark several important places in the code, e.g., a
mark on the class/struct definition that I'm working on, another mark on
a related template, another mark on the function, another mark on the
associated unittest, etc.. This lets me jump to any relevant place in
two keystrokes, without needing to care exactly where it is in the file.
I also have a semi-standard assignment of marks, and update them as I go
along, so I can always switch between different parts of the
functionality I'm working on.  For looking up other things, searching
for symbols and ctags with jump-to-definition is invaluable.

Also, Vim's ability to split the window at any time is invaluable: it
lets me, for example, work on a unittest while having the function body
in view or vice versa. Or jump between the two and work on two places in
the same file at the same time.

Any editor/IDE incapable of such basic functionality ought to be thrown
out the window and replaced with something more usable.


T

-- 
GEEK = Gatherer of Extremely Enlightening Knowledge


Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread Timon Gehr via Digitalmars-d

On 22.02.2018 08:13, Timothee Cour wrote:

Advantages of splitting:
...
* easier to edit (no need to scroll much to see entirety of module
we're editing)


I don't think this particular point is true. Splitting can actually make 
editing slightly harder. Scrolling is an inefficient way to find the 
definition to edit; just search the file for the definition.


Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread Seb via Digitalmars-d
On Thursday, 22 February 2018 at 08:04:19 UTC, Timothee Cour 
wrote:
it actually does reduce compilation times if the imports go 
directly to the module in
question rather than to a module that publicly imports the 
symbols


time1=compilation time of `import std.algorithm : find;` before 
split time21=compilation time of `import std.algorithm : find;` 
after split time22=compilation time of `import 
std.algorithm.searching : find;` after split


my understand is that we have:
time22 < time1 but time21 ~= time1
so we're in no way worse than before the split
unless time21 > time1 (noticably) in which case you have a 
strong argument



On Wed, Feb 21, 2018 at 11:57 PM, Jonathan M Davis via 
Digitalmars-d  wrote:
On Wednesday, February 21, 2018 23:48:32 Timothee Cour via 
Digitalmars-d wrote:

```
import std.algorithm.searching : find;

not

import std.algorithm : find;
```

that's just a missed opportunity to benefit from the split; 
we're in no way worse after the split than before the split 
in that regard. We can just leave it as `import std.algorithm 
: find;` with no adverse effect.


Maybe, but the CI stuff for Phobos doesn't like that, and it 
actually does reduce compilation times if the imports go 
directly to the module in question rather than to a module 
that publicly imports the symbols.


- Jonathan M Davis


Regarding import time:

Apart from the fact that you usually end up importing almost 
entire Phobos anyways, the import cost for importing entire 
Phobos is about 0.2s on my system with the upcoming 2.079 and we 
are working on getting it even further towards 0 which shouldn't 
be too hard (there's still a lot of unnecessary work being done 
by default).


I'm really looking forward to the point where we can just tell 
people to use `import std;` because there's almost no overhead.
(I am aware that there are downsides to importing everything but 
that's not what this discussion is about)


Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread Timothee Cour via Digitalmars-d
> that doesn't help anyone who's actually reading the documentation and trying 
> to find stuff that way

how about the following fix for that:
having a DDOC token on a package.d to indicate merging the submodules
in the documentation, eg:

```
/// MERGE_SUBMODULES
std/aglorithm/package.d
```

when user browses to https://dlang.org/phobos/std_algorithm.html, he
would see the DDOC contents from all direct submodules of
std/aglorithm/ right there in that ddoc page.




On Thu, Feb 22, 2018 at 12:04 AM, Timothee Cour
 wrote:
>> it actually does reduce compilation times if the imports go directly to the 
>> module in
> question rather than to a module that publicly imports the symbols
>
> time1=compilation time of `import std.algorithm : find;` before split
> time21=compilation time of `import std.algorithm : find;` after split
> time22=compilation time of `import std.algorithm.searching : find;` after 
> split
>
> my understand is that we have:
> time22 < time1 but time21 ~= time1
> so we're in no way worse than before the split
> unless time21 > time1 (noticably) in which case you have a strong argument
>
>
> On Wed, Feb 21, 2018 at 11:57 PM, Jonathan M Davis via Digitalmars-d
>  wrote:
>> On Wednesday, February 21, 2018 23:48:32 Timothee Cour via Digitalmars-d
>> wrote:
>>> ```
>>> import std.algorithm.searching : find;
>>>
>>> not
>>>
>>> import std.algorithm : find;
>>> ```
>>>
>>> that's just a missed opportunity to benefit from the split; we're in
>>> no way worse after the split than before the split in that regard. We
>>> can just leave it as `import std.algorithm : find;` with no adverse
>>> effect.
>>
>> Maybe, but the CI stuff for Phobos doesn't like that, and it actually does
>> reduce compilation times if the imports go directly to the module in
>> question rather than to a module that publicly imports the symbols.
>>
>> - Jonathan M Davis
>>


Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread Timothee Cour via Digitalmars-d
> it actually does reduce compilation times if the imports go directly to the 
> module in
question rather than to a module that publicly imports the symbols

time1=compilation time of `import std.algorithm : find;` before split
time21=compilation time of `import std.algorithm : find;` after split
time22=compilation time of `import std.algorithm.searching : find;` after split

my understand is that we have:
time22 < time1 but time21 ~= time1
so we're in no way worse than before the split
unless time21 > time1 (noticably) in which case you have a strong argument


On Wed, Feb 21, 2018 at 11:57 PM, Jonathan M Davis via Digitalmars-d
 wrote:
> On Wednesday, February 21, 2018 23:48:32 Timothee Cour via Digitalmars-d
> wrote:
>> ```
>> import std.algorithm.searching : find;
>>
>> not
>>
>> import std.algorithm : find;
>> ```
>>
>> that's just a missed opportunity to benefit from the split; we're in
>> no way worse after the split than before the split in that regard. We
>> can just leave it as `import std.algorithm : find;` with no adverse
>> effect.
>
> Maybe, but the CI stuff for Phobos doesn't like that, and it actually does
> reduce compilation times if the imports go directly to the module in
> question rather than to a module that publicly imports the symbols.
>
> - Jonathan M Davis
>


Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread Jonathan M Davis via Digitalmars-d
On Wednesday, February 21, 2018 23:44:49 Timothee Cour via Digitalmars-d 
wrote:
> >  it's harder to find symbols
>
> i don't understand this argument.
>
> ```
> dscanner --declaration startsWith
> ./std/algorithm/searching.d(4105:6)
> ./std/algorithm/searching.d(4195:6)
> ./std/algorithm/searching.d(4265:6)
> ./std/algorithm/searching.d(4301:6)
> ```

I have no intention whatsoever of using dscanner. I can do something similar
with grep, but either way, that doesn't help anyone who's actually reading
the documentation and trying to find stuff that way, which is often what
people do.

Regardless, the more split up stuff is, the more places that you have to
look for it. We already have plenty of complaints about folks not knowing
whether something is in std.algorithm, std.array, std.range, and std.string
without getting into issues of sub-modules.

And I really don't understand why splitting up a module like std.array would
help anything except maybe import times, but that only matters if the module
is large (which std.array definitely isn't) and if you're importing the
modules as specifically as possible and not using public imports through
something like std.array.package.

- Jonathan M Davis



Re: what are guidelines for when to split a module into a package?

2018-02-22 Thread Jonathan M Davis via Digitalmars-d
On Wednesday, February 21, 2018 23:48:32 Timothee Cour via Digitalmars-d 
wrote:
> ```
> import std.algorithm.searching : find;
>
> not
>
> import std.algorithm : find;
> ```
>
> that's just a missed opportunity to benefit from the split; we're in
> no way worse after the split than before the split in that regard. We
> can just leave it as `import std.algorithm : find;` with no adverse
> effect.

Maybe, but the CI stuff for Phobos doesn't like that, and it actually does
reduce compilation times if the imports go directly to the module in
question rather than to a module that publicly imports the symbols.

- Jonathan M Davis



Re: what are guidelines for when to split a module into a package?

2018-02-21 Thread Timothee Cour via Digitalmars-d
```
import std.algorithm.searching : find;

not

import std.algorithm : find;
```

that's just a missed opportunity to benefit from the split; we're in
no way worse after the split than before the split in that regard. We
can just leave it as `import std.algorithm : find;` with no adverse
effect.


On Wed, Feb 21, 2018 at 11:44 PM, Timothee Cour
 wrote:
>>  it's harder to find symbols
>
> i don't understand this argument.
>
> ```
> dscanner --declaration startsWith
> ./std/algorithm/searching.d(4105:6)
> ./std/algorithm/searching.d(4195:6)
> ./std/algorithm/searching.d(4265:6)
> ./std/algorithm/searching.d(4301:6)
> ```
>
>
> On Wed, Feb 21, 2018 at 11:31 PM, Jonathan M Davis via Digitalmars-d
>  wrote:
>> On Wednesday, February 21, 2018 23:13:33 Timothee Cour via Digitalmars-d
>> wrote:
>>> from my perspective it makes sense to split a module M into submodules
>>> A, B when:
>>> * M is large
>>> * there's little interaction between A and B (eg only few symbols from
>>> A are needed in B and vice versa)
>>> * A and B are logically grouped (that is domain specific)
>>> * it doesn't turn into an extreme (1 function per module)
>>>
>>> Advantages of splitting:
>>> * easier to review
>>> * easier to edit (no need to scroll much to see entirety of module
>>> we're editing)
>>> * less pollution from top-level imports as they only affect submodule
>>> (likewise with top-level attributes etc)
>>> * more modular
>>> * doesn't affect existing code since `import M` will continue to work
>>> after M is split into a package
>>> * less memory when using separate compilation
>>> * allows fine-grained compiler options (eg we can compile B with `-O` if
>>> needed) * allows to run unittests just for A instead of M
>>> * allows selective import in client to avoid pulling in too many
>>> dependencies (see arguments that were made for std.range.primitives)
>>>
>>> Disadvantages of splitting:
>>> * more files; but not sure why that's a problem so long we don't go
>>> into extremes (eg 1 function per module or other things of bad taste)
>>>
>>> ---
>>> while working on https://github.com/dlang/phobos/pull/6178 I had
>>> initially split M:std.array into submodules:
>>> A:std.array.util (the old std.array) and B:std.array.static_array
>>> (everything added in the PR)
>>> IMO this made sense according to my above criteria (in this case there
>>> was 0 interaction between A and B), but the reviewers disagreed with
>>> the split.
>>>
>>> So, what are the guidelines?
>>
>> It's decided on a case-by-case basis but is generally only done if the
>> module is quite large. std.array is not particularly large. It's less than
>> 4000 lines, including unit tests and documentation, and it only has 18
>> top-level symbols.
>>
>> Also, remember that within Phobos, imports are supposed to be as localized
>> as possible - both in terms of where the import is placed and in terms of
>> selective imports - e.g. it would be
>>
>> import std.algorithm.searching : find;
>>
>> not
>>
>> import std.algorithm : find;
>>
>> which means that splitting the module then requires that all of those
>> imports be even more specific. User code can choose to do that or not, but
>> it does make having modules split up further that much more tedious. Related
>> to that is the fact that anyone searching for these symbols now has more
>> modules to search through. So, finding symbols will be harder. Take
>> std.algorithm for instance. It was split, because it was getting large
>> enough that compiling it on machines without large amounts of memory
>> resulted in the compiler running out of memory. So, there was a very good
>> argument for splitting it. However, now, even if you know that a symbol is
>> in std.algorithm, do you know where in std.algorithm it is? Some are obvious
>> - e.g. sort is in std.algorithm.sorting. However, others are not so
>> obviously - e.g. where does startsWith live? Arguably, it could go in either
>> std.algorithm.comparison or std.algorithm.searching. It turns out that it's
>> in std.algorithm.searching, but I generally have to look it up. And where to
>> functions like map or filter live? std.algorithm.mutation?
>> std.algorithm.iteration? It's not necessarily obvious at all.
>>
>> From the perspective of users trying to find stuff, splitting modules up
>> comes at a real cost, and I honestly don't understand why some folks are in
>> a hurry to make module really small. That means more import statements when
>> using those modules, and it means that it's harder to find symbols.
>>
>> Personally, I think that we should be very slow to consider splitting
>> modules and only do so when it's clear that there's a real need, and
>> std.array is nowhere near that level.
>>
>> - Jonathan M Davis
>>
>>


Re: what are guidelines for when to split a module into a package?

2018-02-21 Thread Timothee Cour via Digitalmars-d
>  it's harder to find symbols

i don't understand this argument.

```
dscanner --declaration startsWith
./std/algorithm/searching.d(4105:6)
./std/algorithm/searching.d(4195:6)
./std/algorithm/searching.d(4265:6)
./std/algorithm/searching.d(4301:6)
```


On Wed, Feb 21, 2018 at 11:31 PM, Jonathan M Davis via Digitalmars-d
 wrote:
> On Wednesday, February 21, 2018 23:13:33 Timothee Cour via Digitalmars-d
> wrote:
>> from my perspective it makes sense to split a module M into submodules
>> A, B when:
>> * M is large
>> * there's little interaction between A and B (eg only few symbols from
>> A are needed in B and vice versa)
>> * A and B are logically grouped (that is domain specific)
>> * it doesn't turn into an extreme (1 function per module)
>>
>> Advantages of splitting:
>> * easier to review
>> * easier to edit (no need to scroll much to see entirety of module
>> we're editing)
>> * less pollution from top-level imports as they only affect submodule
>> (likewise with top-level attributes etc)
>> * more modular
>> * doesn't affect existing code since `import M` will continue to work
>> after M is split into a package
>> * less memory when using separate compilation
>> * allows fine-grained compiler options (eg we can compile B with `-O` if
>> needed) * allows to run unittests just for A instead of M
>> * allows selective import in client to avoid pulling in too many
>> dependencies (see arguments that were made for std.range.primitives)
>>
>> Disadvantages of splitting:
>> * more files; but not sure why that's a problem so long we don't go
>> into extremes (eg 1 function per module or other things of bad taste)
>>
>> ---
>> while working on https://github.com/dlang/phobos/pull/6178 I had
>> initially split M:std.array into submodules:
>> A:std.array.util (the old std.array) and B:std.array.static_array
>> (everything added in the PR)
>> IMO this made sense according to my above criteria (in this case there
>> was 0 interaction between A and B), but the reviewers disagreed with
>> the split.
>>
>> So, what are the guidelines?
>
> It's decided on a case-by-case basis but is generally only done if the
> module is quite large. std.array is not particularly large. It's less than
> 4000 lines, including unit tests and documentation, and it only has 18
> top-level symbols.
>
> Also, remember that within Phobos, imports are supposed to be as localized
> as possible - both in terms of where the import is placed and in terms of
> selective imports - e.g. it would be
>
> import std.algorithm.searching : find;
>
> not
>
> import std.algorithm : find;
>
> which means that splitting the module then requires that all of those
> imports be even more specific. User code can choose to do that or not, but
> it does make having modules split up further that much more tedious. Related
> to that is the fact that anyone searching for these symbols now has more
> modules to search through. So, finding symbols will be harder. Take
> std.algorithm for instance. It was split, because it was getting large
> enough that compiling it on machines without large amounts of memory
> resulted in the compiler running out of memory. So, there was a very good
> argument for splitting it. However, now, even if you know that a symbol is
> in std.algorithm, do you know where in std.algorithm it is? Some are obvious
> - e.g. sort is in std.algorithm.sorting. However, others are not so
> obviously - e.g. where does startsWith live? Arguably, it could go in either
> std.algorithm.comparison or std.algorithm.searching. It turns out that it's
> in std.algorithm.searching, but I generally have to look it up. And where to
> functions like map or filter live? std.algorithm.mutation?
> std.algorithm.iteration? It's not necessarily obvious at all.
>
> From the perspective of users trying to find stuff, splitting modules up
> comes at a real cost, and I honestly don't understand why some folks are in
> a hurry to make module really small. That means more import statements when
> using those modules, and it means that it's harder to find symbols.
>
> Personally, I think that we should be very slow to consider splitting
> modules and only do so when it's clear that there's a real need, and
> std.array is nowhere near that level.
>
> - Jonathan M Davis
>
>


Re: what are guidelines for when to split a module into a package?

2018-02-21 Thread Jonathan M Davis via Digitalmars-d
On Wednesday, February 21, 2018 23:13:33 Timothee Cour via Digitalmars-d 
wrote:
> from my perspective it makes sense to split a module M into submodules
> A, B when:
> * M is large
> * there's little interaction between A and B (eg only few symbols from
> A are needed in B and vice versa)
> * A and B are logically grouped (that is domain specific)
> * it doesn't turn into an extreme (1 function per module)
>
> Advantages of splitting:
> * easier to review
> * easier to edit (no need to scroll much to see entirety of module
> we're editing)
> * less pollution from top-level imports as they only affect submodule
> (likewise with top-level attributes etc)
> * more modular
> * doesn't affect existing code since `import M` will continue to work
> after M is split into a package
> * less memory when using separate compilation
> * allows fine-grained compiler options (eg we can compile B with `-O` if
> needed) * allows to run unittests just for A instead of M
> * allows selective import in client to avoid pulling in too many
> dependencies (see arguments that were made for std.range.primitives)
>
> Disadvantages of splitting:
> * more files; but not sure why that's a problem so long we don't go
> into extremes (eg 1 function per module or other things of bad taste)
>
> ---
> while working on https://github.com/dlang/phobos/pull/6178 I had
> initially split M:std.array into submodules:
> A:std.array.util (the old std.array) and B:std.array.static_array
> (everything added in the PR)
> IMO this made sense according to my above criteria (in this case there
> was 0 interaction between A and B), but the reviewers disagreed with
> the split.
>
> So, what are the guidelines?

It's decided on a case-by-case basis but is generally only done if the
module is quite large. std.array is not particularly large. It's less than
4000 lines, including unit tests and documentation, and it only has 18
top-level symbols.

Also, remember that within Phobos, imports are supposed to be as localized
as possible - both in terms of where the import is placed and in terms of
selective imports - e.g. it would be

import std.algorithm.searching : find;

not

import std.algorithm : find;

which means that splitting the module then requires that all of those
imports be even more specific. User code can choose to do that or not, but
it does make having modules split up further that much more tedious. Related
to that is the fact that anyone searching for these symbols now has more
modules to search through. So, finding symbols will be harder. Take
std.algorithm for instance. It was split, because it was getting large
enough that compiling it on machines without large amounts of memory
resulted in the compiler running out of memory. So, there was a very good
argument for splitting it. However, now, even if you know that a symbol is
in std.algorithm, do you know where in std.algorithm it is? Some are obvious
- e.g. sort is in std.algorithm.sorting. However, others are not so
obviously - e.g. where does startsWith live? Arguably, it could go in either
std.algorithm.comparison or std.algorithm.searching. It turns out that it's
in std.algorithm.searching, but I generally have to look it up. And where to
functions like map or filter live? std.algorithm.mutation?
std.algorithm.iteration? It's not necessarily obvious at all.

>From the perspective of users trying to find stuff, splitting modules up
comes at a real cost, and I honestly don't understand why some folks are in
a hurry to make module really small. That means more import statements when
using those modules, and it means that it's harder to find symbols.

Personally, I think that we should be very slow to consider splitting
modules and only do so when it's clear that there's a real need, and
std.array is nowhere near that level.

- Jonathan M Davis




Re: what are guidelines for when to split a module into a package?

2018-02-21 Thread rikki cattermole via Digitalmars-d

On 22/02/2018 8:13 PM, Timothee Cour wrote:

from my perspective it makes sense to split a module M into submodules
A, B when:
* M is large


Soft 1k, 2-3k hard limit.

At this point either your scope is good, or you need to subdivide it again.