Re: In general, who should do more work: popFront or front?

2021-06-15 Thread Steven Schveighoffer via Digitalmars-d-learn

On 6/15/21 12:24 AM, surlymoor wrote:
All my custom range types perform all their meaningful work in their 
respective popFront methods, in addition to its expected source data 
iteration duties. The reason I do this is because I swear I read in a 
github discussion that front is expected to be O(1), and the only way I 
can think to achieve this is to stash the front element of a range in a 
private field; popFront would thus also set this field to a new value 
upon every call, and front would forward to it. (Or front would be the 
cache itself.)
At the moment, I feel that as long as the stashed front element isn't 
too "big" (For some definition of big, I guess.), that built-in caching 
should be fine. But is this acceptable? What's the best practice for 
determining which range member should perform what work? (Other than 
iterating, of course.)


IMO, `front` should not be figuring out which element is next. That is 
the job of `popFront`.


But that doesn't mean `front` cannot do work. map is a prime example. If 
you use `map` though, you know what you are signing up for.


`front` is expected to be called multiple times to get the same data. 
One thing to think about is that there is a `cache` wrapper range which 
can store it for you. This way, you give your users the option of 
caching or not.


Each situation is different, and depends on the underlying mechanisms 
needed to make the range operate.


-Steve


Re: In general, who should do more work: popFront or front?

2021-06-15 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jun 15, 2021 at 02:20:11PM +, Paul Backus via Digitalmars-d-learn 
wrote:
[...]
> It's a time-space tradeoff. As you say, caching requires additional
> space to store the cached element. On the other hand, *not* caching
> means that you spend unnecessary time computing the next element in
> cases where the range is only partially consumed. For example:
> 
> ```d
> import std.range: generate, take;
> import std.algorithm: each;
> import std.stdio: writeln;
> 
> generate!someExpensiveFunction.take(3).each!writeln;
> ```
> 
> Naively, you'd expect that `someExpensiveFunction` would be called 3
> times--but it is actually called 4 times, because `generate` does its
> work in its constructor and `popFront` instead of in `front`.

One way to address this is to make the computation lazy: the element is
only computed once on demand, and once computed it's cached.  But of
course, this is probably overkill on many ranges.

So the long answer is, it depends. :-)


T

-- 
It is not the employer who pays the wages. Employers only handle the money. It 
is the customer who pays the wages. -- Henry Ford


Re: In general, who should do more work: popFront or front?

2021-06-15 Thread Paul Backus via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:
All my custom range types perform all their meaningful work in 
their respective popFront methods, in addition to its expected 
source data iteration duties. The reason I do this is because I 
swear I read in a github discussion that front is expected to 
be O(1), and the only way I can think to achieve this is to 
stash the front element of a range in a private field; popFront 
would thus also set this field to a new value upon every call, 
and front would forward to it. (Or front would be the cache 
itself.)
At the moment, I feel that as long as the stashed front element 
isn't too "big" (For some definition of big, I guess.), that 
built-in caching should be fine. But is this acceptable? What's 
the best practice for determining which range member should 
perform what work? (Other than iterating, of course.)


It's a time-space tradeoff. As you say, caching requires 
additional space to store the cached element. On the other hand, 
*not* caching means that you spend unnecessary time computing the 
next element in cases where the range is only partially consumed. 
For example:


```d
import std.range: generate, take;
import std.algorithm: each;
import std.stdio: writeln;

generate!someExpensiveFunction.take(3).each!writeln;
```

Naively, you'd expect that `someExpensiveFunction` would be 
called 3 times--but it is actually called 4 times, because 
`generate` does its work in its constructor and `popFront` 
instead of in `front`.


Re: In general, who should do more work: popFront or front?

2021-06-15 Thread Ali Çehreli via Digitalmars-d-learn

On 6/14/21 10:17 PM, mw wrote:

> I think there is another convention (although it's not formally
> enforced, but should be) is:
>
> -- `obj.front() [should be] const`, i.e. it shouldn't modify `obj`, so
> can be called multiple times at any given state, and produce the same
> result

In other words, front() should be "idempotent".

To the OP, there is the following presentation that is related and 
touches on similar concerns:


  https://forum.dlang.org/thread/diexjstekiyzgxlic...@forum.dlang.org

Ali



Re: In general, who should do more work: popFront or front?

2021-06-15 Thread Mike Parker via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:


At the moment, I feel that as long as the stashed front element 
isn't too "big" (For some definition of big, I guess.), that 
built-in caching should be fine. But is this acceptable? What's 
the best practice for determining which range member should 
perform what work? (Other than iterating, of course.)


The "general rule" is that `front` should return the same result 
on multiple calls until after the next call to `popFront`. I 
don't know of any requirement that such an operation be O(1), but 
the expectation is certainly there. The implication then is that 
any necessary work should be carried out in `popFront` to advance 
the range (and additionally in a constructor if things need to be 
teed up first) and that `front` should just return the current 
element.




Re: In general, who should do more work: popFront or front?

2021-06-14 Thread ag0aep6g via Digitalmars-d-learn

On 15.06.21 07:17, mw wrote:

https://dlang.org/library/std/range/primitives/front.html
the 2nd decl:
dchar front(T) (
   scope const(T)[] a
) pure @property @safe
if (isAutodecodableString!(T[]));

you can see `const`


but

https://dlang.org/library/std/range/primitives/pop_front.html
void popFront(C) (
   scope ref inout(C)[] str
) pure nothrow @trusted
if (isAutodecodableString!(C[]));

it's `ref inout`, which means the passed-in object is intended to be 
modified inside the method in-place.


`const` and `inout` mean the same thing here: Both functions cannot 
modify the elements of the array.


The difference lies in `ref` alone. `front` receives a copy, so it can't 
change the length of the array you pass in. `popFront` receives by 
reference, so it can (and does).


Re: In general, who should do more work: popFront or front?

2021-06-14 Thread surlymoor via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 05:17:06 UTC, mw wrote:
I think there is another convention (although it's not formally 
enforced, but should be) is:


-- `obj.front() [should be] const`, i.e. it shouldn't modify 
`obj`, so can be called multiple times at any given state, and 
produce the same result
-- `obj.popFront()` perform the actual mutation of the internal 
state the `obj`, each invocation will change the object's state 
once.


That's a separate matter. To use as an example once more, 
MapResult's front method produces the range's elements, and its 
invocation results in the identical element given that popFront 
isn't called; it doesn't affect the range's source. It satisfies 
those expectations, but it leaves wanting to know whether one 
should strive for front being inexpensive to call.
My original question is stupid, and I think I'm just going to 
have to experiment.


Re: In general, who should do more work: popFront or front?

2021-06-14 Thread mw via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 05:03:45 UTC, surlymoor wrote:

On Tuesday, 15 June 2021 at 04:57:45 UTC, mw wrote:
In English, `front` is a noun, `popFront` have a verb in it, 
so you know who should do more work :-)


Sure, but, for example, the front method of Phobos' MapResult 
is the one applying the predicate to the source's front. 
There's a clear separation of responsibilities between popFront 
and front: the former iterates over the source, and the latter 
performs an operation that produces the range's elements.



I think there is another convention (although it's not formally 
enforced, but should be) is:


-- `obj.front() [should be] const`, i.e. it shouldn't modify 
`obj`, so can be called multiple times at any given state, and 
produce the same result


-- `obj.popFront()` perform the actual mutation of the internal 
state the `obj`, each invocation will change the object's state 
once.



e.g.

https://dlang.org/library/std/range/primitives/front.html
the 2nd decl:
dchar front(T) (
  scope const(T)[] a
) pure @property @safe
if (isAutodecodableString!(T[]));

you can see `const`


but

https://dlang.org/library/std/range/primitives/pop_front.html
void popFront(C) (
  scope ref inout(C)[] str
) pure nothrow @trusted
if (isAutodecodableString!(C[]));

it's `ref inout`, which means the passed-in object is intended to 
be modified inside the method in-place.




Re: In general, who should do more work: popFront or front?

2021-06-14 Thread surlymoor via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 04:57:45 UTC, mw wrote:
In English, `front` is a noun, `popFront` have a verb in it, so 
you know who should do more work :-)


Sure, but, for example, the front method of Phobos' MapResult is 
the one applying the predicate to the source's front. There's a 
clear separation of responsibilities between popFront and front: 
the former iterates over the source, and the latter performs an 
operation that produces the range's elements.


Re: In general, who should do more work: popFront or front?

2021-06-14 Thread mw via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:
All my custom range types perform all their meaningful work in 
their respective popFront methods, in addition to its expected 
source data iteration duties. The reason I do this is because I 
swear I read in a github discussion that front is expected to 
be O(1), and the only way I can think to achieve this is to 
stash the front element of a range in a private field; popFront 
would thus also set this field to a new value upon every call, 
and front would forward to it. (Or front would be the cache 
itself.)
At the moment, I feel that as long as the stashed front element 
isn't too "big" (For some definition of big, I guess.), that 
built-in caching should be fine. But is this acceptable? What's 
the best practice for determining which range member should 
perform what work? (Other than iterating, of course.)


In English, `front` is a noun, `popFront` have a verb in it, so 
you know who should do more work :-)


Re: In general, who should do more work: popFront or front?

2021-06-14 Thread surlymoor via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 04:43:38 UTC, jfondren wrote:

Well, consider this program:

```d
import std;

struct Noisy {
int[] source;
int pops, fronts;
bool empty() { return source.empty; }
void popFront() { writeln("popFront #", ++pops); 
source.popFront; }
int front() { writeln("front #", ++fronts); return 
source.front; }

}

void main() {
iota(5).array
.Noisy
.filter!"a%2"
.each!writeln;
}
```

Out of [0,1,2,3,4], only 1,3 pass the filter.
Noisy's front is called seven times, 2x for each filter success.
Noisy's popFront is called five times, 1x for each source 
member.


But if you slap a .cache (from std.algorithm.cache) before the
.filter then these counts are the same.


Appreciate the response.
From your example, one solution is that perhaps documenting one's 
custom range type with a disclaimer that its front method is 
expensive upon invocation is reasonable; the consumer of the 
range type, given this information, might then determine whether 
using cache is appropriate for them.
I'm still left wondering whether one should strive for ensuring 
that front is O(1), and various solutions to this. With that 
said, I now realize my question is kind of ridiculous since the 
answer might be predicated upon other aspects: the type data 
being consumed; the common operations that are to be performed 
upon a custom range.


Re: In general, who should do more work: popFront or front?

2021-06-14 Thread jfondren via Digitalmars-d-learn

On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:
All my custom range types perform all their meaningful work in 
their respective popFront methods, in addition to its expected 
source data iteration duties. The reason I do this is because I 
swear I read in a github discussion that front is expected to 
be O(1), and the only way I can think to achieve this is to 
stash the front element of a range in a private field; popFront 
would thus also set this field to a new value upon every call, 
and front would forward to it. (Or front would be the cache 
itself.)
At the moment, I feel that as long as the stashed front element 
isn't too "big" (For some definition of big, I guess.), that 
built-in caching should be fine. But is this acceptable? What's 
the best practice for determining which range member should 
perform what work? (Other than iterating, of course.)


Well, consider this program:

```d
import std;

struct Noisy {
int[] source;
int pops, fronts;
bool empty() { return source.empty; }
void popFront() { writeln("popFront #", ++pops); 
source.popFront; }
int front() { writeln("front #", ++fronts); return 
source.front; }

}

void main() {
iota(5).array
.Noisy
.filter!"a%2"
.each!writeln;
}
```

Out of [0,1,2,3,4], only 1,3 pass the filter.
Noisy's front is called seven times, 2x for each filter success.
Noisy's popFront is called five times, 1x for each source member.

But if you slap a .cache (from std.algorithm.cache) before the
.filter then these counts are the same.


In general, who should do more work: popFront or front?

2021-06-14 Thread surlymoor via Digitalmars-d-learn
All my custom range types perform all their meaningful work in 
their respective popFront methods, in addition to its expected 
source data iteration duties. The reason I do this is because I 
swear I read in a github discussion that front is expected to be 
O(1), and the only way I can think to achieve this is to stash 
the front element of a range in a private field; popFront would 
thus also set this field to a new value upon every call, and 
front would forward to it. (Or front would be the cache itself.)
At the moment, I feel that as long as the stashed front element 
isn't too "big" (For some definition of big, I guess.), that 
built-in caching should be fine. But is this acceptable? What's 
the best practice for determining which range member should 
perform what work? (Other than iterating, of course.)