Re: Suggested magic for a .. b

2010-07-17 Thread Aaron Sherman
On Fri, Jul 16, 2010 at 9:40 PM, Michael Zedeler mich...@zedeler.dk wrote:


 What started it all was the intention to extend the operator, making it
 possible to evaluate it in list context. Doing so has opened Pandora's box,
 because many (most? all?) solutions are inconsistent with the rule of least
 surprise.


I don't think there's any coherent expectation, and therefore no potential
to avoid surprise. Returning comic books might be more of a surprise, but as
long as you're returning a string which appears to be in the range
expressed, then I don't see surprise as the problem.



 For instance, when considering strings, writing up an expression like

 'goat' ~~ 'cow' .. 'zebra'

 This makes sense in most cases, because goat is lexicographically between
 cow and zebra.


This presumes that we're treating a string as a number in base x (where x,
I guess, would be the number of code points which share ... what, any of the
general category properties of the components of the input strings?).

That begins to get horrendously messy very, very fast:

 say "1aB" .. "aB1"



 I'd suggest that if you want to evaluate a Range in list context, you may
 have to provide a hint to the Range generator, telling it how to generate
 subsequent values. Your suggestion that the expansion of 'Ab' ..  'Be'
 should yield Ab Ac Ad Ae Bb Bc Bd Be is just an example of a different
 generator (you could call it a different implementation of ++ on Str types).
 It does look useful, but by realizing that it probably is, we have two
 candidates for how Ranges should evaluate in list context.


I think the solution here is to evaluate what's practical in the general
case. Your examples are, I think, misleading because they involve English
words, and we naturally leap to "sure, that one's in the dictionary between
the other two." However, let me pose this dictionary lookup for you:

 "cliché" ~~ "aphorism" .. "truth"

Now, you see where this is going? What happens when we throw in some
punctuation?

 "father-in-law" ~~ "dad" .. "stranger"

The problem is that you have a complex heuristic in mind for determining
membership, and a very simple operator for expressing the set. Worse, I
haven't even gotten into dealing with Unicode where it's entirely reasonable
to write "TOPIXコンポジット1500構成銘柄", which I shamelessly grabbed from a Tokyo
Stock Exchange page. That one string, used in everyday text, contains Latin
letters, Katakana, Han (Kanji) ideograms and Latin digits.
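
To make the membership question concrete, here is a minimal Python sketch (Python rather than Perl 6, purely for illustration) of the simplest possible rule: compare the strings by code point. The helper name in_range is hypothetical, and nothing here claims to be how Perl 6 or Rakudo defines ~~ against a Range.

```python
def in_range(s, lo, hi):
    """Naive membership: plain code-point comparison, no collation rules."""
    return lo <= s <= hi


# The three examples from this thread all pass the naive test.
print(in_range("goat", "cow", "zebra"))
print(in_range("cliché", "aphorism", "truth"))
print(in_range("father-in-law", "dad", "stranger"))

# By code point, "é" (U+00E9) sorts after every ASCII letter, so a
# collation that treats "é" like "e" would accept this one, but the
# naive comparison rejects it.
print(in_range("cliché", "clichd", "clichf"))
```

The last call is the interesting one: the answer depends entirely on which ordering you assume, which is the heuristic the operator itself has no way to express.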

Meanwhile, back to ".." ... the range operator. The most useful application
that I can think of for strings of length > 1 is for generating unique
strings such as for mktemp.

Beyond that, its application is actually quite limited, because the rules
for any other sort of string that might make sense to a human are absurdly
complex.

As such, I think it suffices to say that, for the most part, ".." makes
sense for single-character strings, and to expand from there, rather than
trying to introduce anything more complex.
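
As a rough illustration of the single-character case, the following Python sketch walks code points between two one-character endpoints. The name char_range is made up for this example, and ordering by code point is an assumption, not a statement about what the Perl 6 spec requires.

```python
def char_range(first, last):
    """Inclusive range of single characters, ordered by code point."""
    if len(first) != 1 or len(last) != 1:
        raise ValueError("char_range expects single-character strings")
    return [chr(cp) for cp in range(ord(first), ord(last) + 1)]


print(char_range("a", "e"))  # ['a', 'b', 'c', 'd', 'e']
```

With only one character per endpoint there is no carry and no question of which position is significant, which is why this case is easy to agree on.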

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs


Re: Suggested magic for a .. b

2010-07-17 Thread Ruud H.G. van Tol

Aaron Sherman wrote:


Having established this range for each correspondingly indexed letter, the
range for multi-character strings is defined by a left-significant counting
sequence. For example:

Ab .. Be

defines the ranges:

A B and b c d e

This results in a counting sequence (with the most significant character on
the left) as follows:

Ab Ac Ad Ae Bb Bc Bd Be
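
The left-significant counting sequence described above can be sketched in Python as a Cartesian product of per-position ranges. This is only a sketch: positional_range is a hypothetical name, both endpoints are assumed to have the same length, and each position is expanded by plain code point, which sidesteps the question of character categories.

```python
from itertools import product


def positional_range(first, last):
    """Expand each character position independently, leftmost most significant."""
    if len(first) != len(last):
        raise ValueError("endpoints must have the same length")
    # One inclusive code-point range per character position.
    columns = [
        [chr(cp) for cp in range(ord(a), ord(b) + 1)]
        for a, b in zip(first, last)
    ]
    # product() varies the rightmost column fastest.
    return ["".join(chars) for chars in product(*columns)]


print(" ".join(positional_range("Ab", "Be")))  # Ab Ac Ad Ae Bb Bc Bd Be
```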


glob can do that:

perl5.8.5 -wle 'print for <{A,B}{c,d,e}>'
Ac
Ad
Ae
Bc
Bd
Be



Currently, Rakudo produces this:

Ab, Ac, Ad, Ae, Af, Ag, Ah, Ai, Aj, Ak, Al, Am,
An, Ao, Ap, Aq, Ar, As, At, Au, Av, Aw, Ax, Ay,
Az, Ba, Bb, Bc, Bd, Be

which I don't think is terribly useful.
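
For comparison, the 30-element listing above is what you get from repeatedly taking a string successor with carry. The Python sketch below models that with a Perl 5-style increment restricted to non-empty ASCII-letter strings; it is an assumed model that happens to reproduce the listing, not Rakudo's actual implementation, and the names succ and successor_range are hypothetical.

```python
def succ(s):
    """String successor for ASCII letters: 'z' wraps to 'a' and carries left."""
    chars = list(s)
    i = len(chars) - 1
    while i >= 0:
        c = chars[i]
        if c == "z":
            chars[i] = "a"      # wrap and keep carrying
        elif c == "Z":
            chars[i] = "A"      # wrap and keep carrying
        else:
            chars[i] = chr(ord(c) + 1)
            return "".join(chars)
        i -= 1
    # Carried past the leftmost character: grow the string by one position.
    return chars[0] + "".join(chars)


def successor_range(first, last):
    """Apply succ() until `last` is reached or the string outgrows it."""
    result = []
    current = first
    while len(current) <= len(last):
        result.append(current)
        if current == last:
            break
        current = succ(current)
    return result


seq = successor_range("Ab", "Be")
print(len(seq))            # 30
print(", ".join(seq[:3]))  # Ab, Ac, Ad
```

The carry from Az to Ba is what pulls in the strings outside the per-position ranges.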


Good enough for me. For your variant, just override the .. for 'smarter' 
behavior?


--
Ruud