On 2010-07-16 18:40, Aaron Sherman wrote:
Oh bother, I wrote this up last night, but forgot to send it. Here y'all go:

I've been testing ".." recently, and it seems, in Rakudo, to behave like
Perl 5. That is, the magic auto-increment for "a" .. "z" works very wonkily,
given any range that isn't within some very strict definitions (identical
Unicode general category, increasing, etc.) So the following:

"A" .. "z"

produces very odd results.

I'd like to suggest that we re-define this operator on strings as follows:

[cut]

"Ab" .. "Be"

defines the ranges:

<A B> and <b c d e>

This results in a counting sequence (with the most significant character on
the left) as follows:

<Ab Ac Ad Ae Bb Bc Bd Be>

Currently, Rakudo produces this:

"Ab", "Ac", "Ad", "Ae", "Af", "Ag", "Ah", "Ai", "Aj", "Ak", "Al", "Am",
"An", "Ao", "Ap", "Aq", "Ar", "As", "At", "Au", "Av", "Aw", "Ax", "Ay",
"Az", "Ba", "Bb", "Bc", "Bd", "Be"

which I don't think is terribly useful.

I have discussed the Range operator on this list before, and since it so often becomes the topic of discussion, something must be wrong with it.

What started it all was the intention to extend the operator, making it possible to evaluate it in list context. Doing so has opened Pandora's box, because many (most? all?) solutions are inconsistent with the principle of least surprise.

For instance, with strings, consider an expression like

'goat' ~~ 'cow' .. 'zebra'

This makes sense in most cases, because goat is lexicographically between cow and zebra. So we have a nice ordering of strings that even extends to strings of any length (note that the three words used in my example are 3, 4 and 5 letters). As you can see, we even have a Range operator in there, so everything should be fine. What breaks everything is that we expect the Range operator to be able to generate all values between the two provided endpoints. Everything goes downhill from there.

With regard to strings, lexicographical ordering is the only prevailing ordering we provide the developer with (apart from length, which does not provide the strict ordering that is needed). So anyone using the Range operator would assume that when lexicographical ordering is used for the Range membership test, it is also used for generating its members, naturally leading to the infinite sequence

cow
cowa
cowaa
cowaaa
...
cowb
cowba
cowbaa
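To make that concrete, here is a minimal Python sketch of a purely lexicographic successor. It assumes a lowercase a-z alphabet, and the name lexicographic_successors is hypothetical (it is not anything Rakudo provides). Under that ordering the smallest string greater than s is s followed by 'a', so the generator never gets as far as 'cowb':

```python
from itertools import islice

def lexicographic_successors(start, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Yield start, then repeatedly the smallest string greater than the last one.

    Assumption: plain lexicographic order over the given alphabet, where a
    proper prefix sorts before any of its extensions.
    """
    current = start
    while True:
        yield current
        # The immediate successor appends the smallest letter of the alphabet.
        current += alphabet[0]

first_five = list(islice(lexicographic_successors("cow"), 5))
print(first_five)  # ['cow', 'cowa', 'cowaa', 'cowaaa', 'cowaaaa']

# Every generated value still sorts before 'cowb', however far we go.
print(all(s < "cowb" for s in first_five))  # True
```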

For some reason (even though Perl 6 supports infinite lists) we are currently using a completely new construct: the domain of strings limited to the length of the longest operand. This is counterintuitive, since

'cowbell' ~~ 'cow' .. 'zebra'

but

'cow' .. 'zebra'

does not produce 'cowbell' in list context.

The same story applies to other types that come with a natural ordering but have an uncountable domain. Although the solutions differ, the main problem is the same: they all behave counterintuitively.

5.0001 ~~ 1.1 .. 10.1

but

1.1 .. 10.1

does not (and really shouldn't!) produce 5.0001 in list context.
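The same mismatch can be sketched in Python with exact fractions. This assumes that list-context expansion steps by one from the lower endpoint, which is my reading of the behaviour being discussed rather than a confirmed Rakudo detail. The helper step_by_one is hypothetical:

```python
from fractions import Fraction

low, high = Fraction("1.1"), Fraction("10.1")
probe = Fraction("5.0001")

# Membership only needs the ordering of the type.
is_member = low <= probe <= high

def step_by_one(start, stop):
    """Assumed generator: add 1 repeatedly until the upper endpoint is passed."""
    value = start
    while value <= stop:
        yield value
        value += 1

generated = list(step_by_one(low, high))

print(is_member)           # True
print(len(generated))      # 10
print(probe in generated)  # False
```

The membership test and the generator agree on the endpoints but on almost nothing in between.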

I'd suggest that if you want to evaluate a Range in list context, you may have to provide a hint to the Range generator, telling it how to generate subsequent values. Your suggestion that the expansion of 'Ab' .. 'Be' should yield <Ab Ac Ad Ae Bb Bc Bd Be> is just an example of a different generator (you could call it a different implementation of ++ on Str types). It does look useful, but by realizing that it probably is, we have two candidates for how Ranges should evaluate in list context.
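The two candidates can be put side by side in a short Python sketch. Both functions are hypothetical illustrations: positional_ranges models the per-position proposal (assuming endpoints of equal length), and carrying_increment is a simplified, ASCII-letters-only model of the Perl 5-style magic increment, not Rakudo's actual implementation.

```python
from itertools import product

def positional_ranges(first, last):
    """Per-position ranges: 'Ab' .. 'Be' becomes <A B> crossed with <b c d e>."""
    if len(first) != len(last):
        raise ValueError("this sketch assumes endpoints of equal length")
    columns = [
        [chr(code) for code in range(ord(a), ord(b) + 1)]
        for a, b in zip(first, last)
    ]
    # The leftmost column is the most significant position.
    return ["".join(chars) for chars in product(*columns)]

def succ(s):
    """Increment the last letter, carrying leftwards on 'z' or 'Z'."""
    chars = list(s)
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] == "z":
            chars[i] = "a"
        elif chars[i] == "Z":
            chars[i] = "A"
        else:
            chars[i] = chr(ord(chars[i]) + 1)
            return "".join(chars)
    # Every position wrapped around, so the string grows by one letter.
    return chars[0] + "".join(chars)

def carrying_increment(first, last):
    """Apply succ until last is reached or the string outgrows it."""
    values = []
    current = first
    while len(current) <= len(last):
        values.append(current)
        if current == last:
            break
        current = succ(current)
    return values

print(positional_ranges("Ab", "Be"))
# ['Ab', 'Ac', 'Ad', 'Ae', 'Bb', 'Bc', 'Bd', 'Be']
print(len(carrying_increment("Ab", "Be")))  # 30
```

Both are reasonable generators for the same pair of endpoints, which is why the choice arguably belongs to the caller rather than to the Range itself.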

The same applies to Numeric types.

My suggestion is to eliminate the succ method on Rat, Complex, Real and Str, and to point people in the direction of the series operator if they need to generate sequences of things that are uncountable.

Regards,

Michael.
