Re: Range length property

Steven Schveighoffer via Digitalmars-d-learn Tue, 10 Apr 2018 16:46:48 -0700

On 4/10/18 6:07 PM, Cym13 wrote:

On Tuesday, 10 April 2018 at 20:08:14 UTC, Jonathan M Davis wrote:
On Tuesday, April 10, 2018 19:47:10 Nordlöw via Digitalmars-d-learn wrote:
On Tuesday, 10 April 2018 at 14:34:40 UTC, Adam D. Ruppe wrote:
> On Tuesday, 10 April 2018 at 14:25:52 UTC, Nordlöw wrote:
>> Should ranges always provide a length property?
>
> No.
>
>> If so, in which cases is a length property an advantage or a requirement?
>
> Just provide it whenever it is cheap to do so. If you need to do complex calculations or especially loop over contents to figure out the length, do NOT provide it.
>
> But if it is as simple as returning some value, provide it and algorithms can take advantage of it for optimizations etc. as needed.
I'm thinking of my own container Hashmap having its range ByKeyValue requiring one extra word of memory to store the iteration count which, in turn, can be used to calculate the length of the remaining range. Is this motivated?
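A minimal sketch of that idea, assuming a simplified open-addressing table. The `Slot` type, its `used` flag, and this `ByKeyValue` are hypothetical stand-ins, not the actual Hashmap implementation; the range keeps a count of the used slots still ahead of it so that `length` is O(1):

```d
import std.range.primitives : hasLength, isInputRange, walkLength;

/// Hypothetical table slot; `used` marks whether it holds an entry.
struct Slot
{
    string key;
    int value;
    bool used;
}

/// Hypothetical stand-in for the ByKeyValue range discussed above.
struct ByKeyValue
{
    private const(Slot)[] slots; // slots not yet visited
    private size_t remaining;    // the one extra word: used slots left

    this(const(Slot)[] slots, size_t count)
    {
        this.slots = slots;
        this.remaining = count;
        skipUnused();
    }

    @property bool empty() const { return remaining == 0; }
    @property const(Slot) front() const { return slots[0]; }

    void popFront()
    {
        slots = slots[1 .. $];
        --remaining;
        skipUnused();
    }

    /// O(1): no scan over the remaining slots is needed.
    @property size_t length() const { return remaining; }

    private void skipUnused()
    {
        while (remaining != 0 && !slots[0].used)
            slots = slots[1 .. $];
    }
}

unittest
{
    Slot[] table = [Slot("a", 1, true), Slot.init, Slot("b", 2, true), Slot.init];
    auto r = ByKeyValue(table, 2);

    static assert(isInputRange!ByKeyValue);
    static assert(hasLength!ByKeyValue);

    assert(r.length == 2);
    r.popFront();
    assert(r.length == 1 && r.front.key == "b");
    assert(r.walkLength == 1); // walkLength uses .length when it exists
}
```

The count is only trustworthy while the table is not mutated, so the sketch does nothing to detect insertions or removals during iteration.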
That would depend entirely on what you're trying to do, but in general, if a range has length, then some algorithms will be more efficient, and some algorithms do require length. So, if you can provide length, then the range will be more useful, just like a bidirectional range can be more useful than a forward range or a random-access range can be more useful than either. However, if you're not doing anything that ever benefits from it having length, then it doesn't buy you anything. So, it ultimately depends on what you're doing. In a general purpose library, I'd say that it should have length if it can do so in O(1), but if it's just for you, then it may or may not be worth it.
The other thing to consider is what happens when the container is mutated. I don't think that ranges necessarily behave all that well when an underlying container is mutated, but it is something that has to be considered when dealing with a range over a container. Even if mutating the underlying container doesn't necessarily invalidate a range, maintaining the length in the manner that you're suggesting probably makes it so that it would be invalidated in more cases, since if any elements are added or removed in the portion that was already popped off the range, then the iteration count couldn't be used to calculate the length in the same way anymore. Now, with a hash map, the range is probably fully invalidated when anything gets added or removed anyway, since that probably screws with the order of the elements in the range, but how the range is going to behave when the underlying container is mutated and how having the length property does or doesn't affect that is something that you'll need to consider.
- Jonathan M Davis
I find that discussion very interesting as I had never considered that because of design by introspection having a costly length method would lead to unexpected calls by generic algorithms, making it a disadvantage if present.
On the other hand I don't think the end user should have to scratch his head to find the length of a range, especially if it's not trivial to get (say, O(log n) kind of case). Therefore exposing a method in any case seems the best from an API perspective.

O(lg n) is fine for .length, it doesn't need to be O(1). It just can't be O(n). I think we established that "fast" operations are O(lg n) or better.

That being said, I don't know of a use case where you can get the length in O(lg n). It's usually O(1) or O(n).

But to avoid the performance issues mentioned earlier it means it should bear a different name (get/setLength comes to mind). I believe this is the same kind of issue that led to having "in" for associative arrays but not regular ones. However this also leads to less coherent APIs in contradiction with the principle of least surprise.

It's definitely a tradeoff. It pushes some implementation details to the user, but it also makes the runtime complexity more predictable.
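To make the tradeoff concrete, here is a small sketch using Phobos' `hasLength` and `walkLength`. The helper `sizeHint` is a hypothetical name invented for illustration; it shows how generic code can use `length` only when the range offers it, while the caller has to spell out `walkLength` to pay for an O(n) count:

```d
import std.algorithm.iteration : filter;
import std.range.primitives : hasLength, walkLength;

/// Hypothetical helper: returns the length when the range exposes one,
/// otherwise 0 to signal "unknown" without iterating.
size_t sizeHint(R)(R r)
{
    static if (hasLength!R)
        return r.length;
    else
        return 0;
}

unittest
{
    int[] arr = [1, 2, 3, 4, 5, 6];
    auto evens = arr.filter!(x => x % 2 == 0);

    static assert(hasLength!(int[]));
    static assert(!hasLength!(typeof(evens))); // filter cannot know its length

    assert(sizeHint(arr) == 6);     // cheap, via .length
    assert(sizeHint(evens) == 0);   // no hidden O(n) walk
    assert(evens.walkLength == 3);  // explicit O(n) walk over a copy
}
```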

In retrospect since only "unexpected" calls to such methods cause the issue I wonder if it wouldn't be best to have an UDA saying "Hey, please, this method is costly, if you're a generic template performing introspection you should probably not call me". And writing that Andrei's work on complexity annotations comes to mind. Anyway, I don't think the user should use different names just to alleviate an issue on the library side but the alternative would be costly to put in place...
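A rough sketch of what such an annotation could look like. Everything here is hypothetical: the `costly` UDA, the `hasCheapLength` trait, and `countElements` are invented for illustration and are not part of Phobos; only `std.traits.hasUDA` is an existing library facility:

```d
import std.traits : hasUDA;

/// Hypothetical marker UDA: "this member is expensive to call".
struct costly {}

/// Hypothetical trait: true if R has a `length` member not marked @costly.
template hasCheapLength(R)
{
    static if (__traits(hasMember, R, "length"))
        enum hasCheapLength = !hasUDA!(__traits(getMember, R, "length"), costly);
    else
        enum hasCheapLength = false;
}

/// Range whose length is a stored value.
struct Fast
{
    int[] a;
    bool empty() const { return a.length == 0; }
    int front() const { return a[0]; }
    void popFront() { a = a[1 .. $]; }
    size_t length() const { return a.length; }
}

/// Range whose length has to walk the contents, so it is annotated.
struct Slow
{
    int[] a;
    bool empty() const { return a.length == 0; }
    int front() const { return a[0]; }
    void popFront() { a = a[1 .. $]; }
    @costly size_t length() const
    {
        size_t n;
        foreach (_; a) ++n; // stands in for an O(n) computation
        return n;
    }
}

/// Generic code that only calls `length` when it is not marked @costly.
size_t countElements(R)(R r)
{
    static if (hasCheapLength!R)
        return r.length;
    else
    {
        size_t n;
        for (; !r.empty; r.popFront()) ++n;
        return n;
    }
}

unittest
{
    static assert(hasCheapLength!Fast);
    static assert(!hasCheapLength!Slow);
    assert(countElements(Fast([1, 2, 3])) == 3);
    assert(countElements(Slow([1, 2, 3])) == 3);
}
```

Note that existing generic code would still see `Slow.length` through ordinary introspection; only templates that know about the marker would skip it.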

Potentially, but remember at the time length and walkLength were conceived, UDAs didn't exist!

Using UDAs would also have the unfortunate side effect of eliminating self-documentation. When you see walkLength right now, you know it's "slow", when you see length, you know "fast". If you have to look at UDAs to figure that out, then reading the code is that much harder.


-Steve
