Re: Suggested magic for "a" .. "b"


On 2010-07-28 06:54, Martin D Kealey wrote:

On Wed, 28 Jul 2010, Michael Zedeler wrote:
   

Writing for ($a .. $b).reverse -> $c { ...} may then blow up because it
turns out that $b doesn't have a .succ method when coercing to sequence
(where the LHS must have an initial value), just like
 for $a .. $b -> $c { ... }
should be able to blow up because the LHS of a Range shouldn't have to
support .succ.
 

Presumably you'd only throw that exception if, as well, $b doesn't support .pred?
   
Yes. It should be .pred. So ($a .. $b).reverse is only possible if 
$b.pred is defined and $a.gt is defined (and taking an object that has 
the type of $b.pred). If the coercion to Sequence is taking place first, 
we'll have to live with two additional constraints ($b.lt and $a.succ), 
but I guess it would be easy to overload .reverse and get rid of those.


Regards,

Michael.

Re: Suggested magic for a .. b


Michael Zedeler wrote:
This is exactly why I keep writing posts about Ranges being defunct as 
they have been specified now. If we accept the premise that Ranges are 
supposed to define a kind of linear membership specification between two 
starting points (as in math), it doesn't make sense that the LHS has an 
additional constraint (having to provide a .succ method). All we should 
require is that both endpoints support comparison (that they share a 
common type with comparison, at least).


Yes, I agree 100%.  All that should be required to construct a range 
$foo..$bar is that the endpoints are comparable, meaning $foo cmp $bar 
works.  Having a .pred or .succ for $foo|$bar should not be required to define a 
range but only to use that range as a generator. -- Darren Duncan
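
The split Darren describes (comparison to build and test a range, .succ or .pred only to generate from it) can be modeled in a few lines. This is a minimal sketch in Python rather than Perl 6; the Interval class and its forward/backward methods are invented for illustration and are not part of any spec.

```python
class Interval:
    """Toy model of a range: the endpoints only need to be comparable."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __contains__(self, x):
        # Membership needs nothing beyond comparison.
        return self.lo <= x <= self.hi

    def forward(self, succ):
        # Generating upward needs a successor function, applied from the left endpoint.
        x = self.lo
        while x <= self.hi:
            yield x
            x = succ(x)

    def backward(self, pred):
        # Generating downward needs a predecessor function, applied from the right endpoint.
        x = self.hi
        while x >= self.lo:
            yield x
            x = pred(x)


inside = 2.25 in Interval(1.5, 4.0)                      # comparison only
ups = list(Interval(1, 4).forward(lambda n: n + 1))      # needs "succ"
downs = list(Interval(1, 4).backward(lambda n: n - 1))   # needs "pred"
```

Passing succ and pred in as arguments makes the point that the interval itself never requires them; only the act of generating does.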

Re: Suggested magic for a .. b


Michael Zedeler wrote:

This is exactly why I keep writing posts about Ranges being defunct as 
they have been specified now. If we accept the premise that Ranges are 
supposed to define a kind of linear membership specification between two 
starting points (as in math), it doesn't make sense that the LHS has an 
additional constraint (having to provide a .succ method). All we should 
require is that both endpoints support comparison (that they share a 
common type with comparison, at least).


To squint at this slightly, in the context that we already have 0...1e10 
as a sequence generator, perhaps the semantics of iterating a range 
should be unordered -- that is,


  for 0..10 -> $x { ... }

is treated as

  for (0...10).pick(*) -> $x { ... }

Then the whole question of reversibility is moot. Plus, there would then 
be a useful distinction for serialization of C<..> vs C<...> (perhaps we 
should even parallelize). When you have two very similar operators it's 
often good to maximize the semantic distance between them so that people 
don't get into the lazy habit of using them without thinking.

Re: Suggested magic for a .. b

Dave Whipp wrote:
 To squint at this slightly, in the context that we already have 0...1e10 as
 a sequence generator, perhaps the semantics of iterating a range should be
 unordered -- that is,

  for 0..10 -> $x { ... }

 is treated as

  for (0...10).pick(*) -> $x { ... }

 Then the whole question of reversibility is moot.

No thanks; I'd prefer it if $a..$b have analogous meanings in item and
list contexts.  As things stand, 10..1 means, in item context,
numbers that are greater or equal to ten and less than or equal to
one, which is equivalent to nothing; in list context, it means an
empty list. This makes sense to me; having it provide a list
containing the numbers 1 through 10 creates a conflict between the two
contexts regardless of how they're arranged.

As I see it, C< $a..$b > in list context is a useful shorthand for C<
$a, *.succ ... $b >.  You only get into trouble when you start trying
to have infix:<..> do more than that in list context.

If anything needs to be done with respect to infix:<..>, it lies in
changing the community perception of the operator.  The only reason
why we're having this debate at all is that in Perl 5, the .. operator
was used to generate lists; so programmers coming from Perl 5 start
with the expectation that that's what it's for in Perl 6, too.  That
expectation needs to be corrected as quickly as can be managed, not
catered to.  But that's not a matter of language design; it's a matter
to be addressed by whoever's going to be writing the Perl 6 tutorials.

-- 
Jonathan Dataweaver Lang
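
Jon's point that 10..1 means "nothing" in both contexts can be checked with a small model. This is a Python sketch under the assumption of integer endpoints; matches and as_list are hypothetical helper names standing in for item context and list context.

```python
def matches(lo, hi, x):
    # "Item context": the range as a membership test.
    return lo <= x <= hi


def as_list(lo, hi):
    # "List context": count upward from lo; nothing is produced when lo > hi.
    return list(range(lo, hi + 1))


# No integer is both >= 10 and <= 1, so nothing matches 10..1 ...
nothing_matches = not any(matches(10, 1, x) for x in range(-100, 101))
# ... and the list form is empty as well, so the two readings agree.
empty = as_list(10, 1)
```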

Re: Suggested magic for a .. b

2010-07-28 Thread Moritz Lenz

Dave Whipp wrote:
 To squint at this slightly, in the context that we already have 0...1e10 
 as a sequence generator, perhaps the semantics of iterating a range 
 should be unordered -- that is,
 
for 0..10 -> $x { ... }
 
 is treated as
 
for (0...10).pick(*) -> $x { ... }

Sorry, I have to ask. Are you serious? Really?

Cheers,
Moritz

Re: Suggested magic for a .. b

2010-07-28 Thread yary

On Wed, Jul 28, 2010 at 8:34 AM, Dave Whipp d...@dave.whipp.name wrote:
 To squint at this slightly, in the context that we already have 0...1e10 as
 a sequence generator, perhaps the semantics of iterating a range should be
 unordered -- that is,

  for 0..10 -> $x { ... }

 is treated as

  for (0...10).pick(*) -> $x { ... }

Makes me think about parallel operations.

for 0...10 -> $x { ... } # 0 through 10 in order
for 0..10 -> $x { ... } # Spawn 11 threads, $x=0 through 10 concurrently
for 10..0 -> $x { ... } # A no-op
for 10...0 -> $x { ... } # 10 down to 0 in order

though would a parallel batch of an anonymous block be more naturally written as
all(0...10) -> $x { ... } # Spawn 11 threads

-y

Re: Suggested magic for a .. b

2010-07-28 Thread Moritz Lenz

yary wrote:
 though would a parallel batch of an anonymous block be more naturally written 
 as
 all(0...10) -> $x { ... } # Spawn 11 threads

No,

hyper for 0..10 -> $x { ... } # spawn as many threads
                              # as the compiler thinks are reasonable

I think one (already specced) syntax for the same thing is enough,
especially considering that hyper operators also do the same job.

Cheers,
Moritz
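
As a rough analogy only, the "let the runtime decide how many threads" idea looks like this with a Python thread pool. This is not how Perl 6's hyper is specified to work; hyper_for and body are hypothetical names, and the block must not depend on iteration order.

```python
from concurrent.futures import ThreadPoolExecutor


def body(x):
    # Stand-in for the loop block.
    return x * x


def hyper_for(items, block, max_workers=None):
    # With max_workers=None the pool picks its own number of worker threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs the calls concurrently but yields results in input order.
        return list(pool.map(block, items))


results = hyper_for(range(0, 11), body)
```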

Re: Suggested magic for a .. b

2010-07-28 Thread TSa (Thomas Sandlaß)

On Wednesday, 28. July 2010 05:12:52 Michael Zedeler wrote:
 Writing ($a .. $b).reverse doesn't make any sense if the result were a
 new Range, since Ranges should then only be used for inclusion tests (so
 swapping endpoints doesn't have any meaningful interpretation), but
 applying .reverse could result in a coercion to Sequence.

Swapping the endpoints could mean swapping inside test to outside
test. The only thing that is needed is to swap from && to ||:

   $a .. $b   # means  $a <= $_ && $_ <= $b  if $a < $b
   $b .. $a   # means  $b <= $_ || $_ <= $a  if $a < $b

Regards TSa.
-- 
The unavoidable price of reliability is simplicity -- C.A.R. Hoare
Simplicity does not precede complexity, but follows it. -- A.J. Perlis
1 + 2 + 3 + 4 + ... = -1/12  -- Srinivasa Ramanujan

Re: Suggested magic for a .. b

2010-07-28 Thread yary

 Swapping the endpoints could mean swapping inside test to outside
 test. The only thing that is needed is to swap from && to ||:

 $a .. $b # means $a <= $_ && $_ <= $b if $a < $b
 $b .. $a # means $b <= $_ || $_ <= $a if $a < $b

I think that's what not, ! are for!

Re: Suggested magic for a .. b

TSa wrote:
 Swapping the endpoints could mean swapping inside test to outside
 test. The only thing that is needed is to swap from && to ||:

   $a .. $b   # means  $a <= $_ && $_ <= $b  if $a < $b
   $b .. $a   # means  $b <= $_ || $_ <= $a  if $a < $b

This is the same sort of discontinuity of meaning that was causing
problems with Perl 5's use of negative indices to count backward from
the end of a list; there's a reason why Perl 6 now uses the [*-$a]
notation for that sort of thing.

Consider a code snippet where the programmer is given two values: one
is a minimum value which must be reached; the other is a maximum value
which must not be exceeded.  In this example, the programmer does not
know what the values are; for all he knows, the minimum threshold
exceeds the maximum.  As things stand, it's trivial to test whether or
not your sample value is viable: if $x ~~ $min .. $max, then you're
golden: it doesn't matter what $min cmp $max is.  With your change,
I'd have to replace the above with something along the lines of:
  if $min <= $max && $x ~~ $min .. $max { ... } - because if $min >
$max, the algorithm will accept values that are well below the minimum
as well as values that are well above the maximum.

Keep it simple, folks!  There are enough corner cases in Perl 6 as
things stand; we don't need to be introducing more of them if we can
help it.

-- 
Jonathan Dataweaver Lang
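
The min/max scenario can be played through with a small model. This is a Python sketch; current_match models the existing "inside" test and swapped_match models one reading of TSa's proposal, and both function names are invented here.

```python
def current_match(lo, hi, x):
    # Existing semantics: always an "inside" test; empty when lo > hi.
    return lo <= x <= hi


def swapped_match(lo, hi, x):
    # One reading of the proposal: reversed endpoints flip to an "outside" test.
    if lo <= hi:
        return lo <= x <= hi
    return lo <= x or x <= hi


minimum, maximum = 10, 5   # the caller's thresholds happen to be contradictory

rejected_now = [x for x in (0, 7, 100) if not current_match(minimum, maximum, x)]
accepted_if_swapped = [x for x in (0, 7, 100) if swapped_match(minimum, maximum, x)]
```

Under the existing semantics every sample is rejected; under the swapped reading, 0 and 100 are accepted even though one is below the minimum and the other above the maximum.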

Re: Suggested magic for a .. b

2010-07-28 Thread Mark J. Reed

On Wednesday, July 28, 2010, Jon Lang datawea...@gmail.com wrote:
 Keep it simple, folks!  There are enough corner cases in Perl 6 as
 things stand; we don't need to be introducing more of them if we can
 help it.

Can I get an Amen?  Amen!


-- 
Mark J. Reed markjr...@gmail.com

Re: Suggested magic for a .. b

2010-07-28 Thread Mark J. Reed

On Wed, Jul 28, 2010 at 2:30 PM, Chris Fields cjfie...@illinois.edu wrote:
 On Jul 28, 2010, at 1:27 PM, Mark J. Reed wrote:
 Can I get an Amen?  Amen!
 --
 Mark J. Reed markjr...@gmail.com

 +1.  I'm agnostic ;

Militant?  :)  ( http://tinyurl.com/3xjgxnl )

Nothing inherently religious about amen (or me), but I'll accept
+1 as synonymous.   :)

-- 
Mark J. Reed markjr...@gmail.com

Re: Suggested magic for a .. b


Moritz Lenz wrote:

Dave Whipp wrote:

   for 0..10 -> $x { ... }
is treated as
   for (0...10).pick(*) -> $x { ... }


Sorry, I have to ask. Are you serious? Really?


Ah, to reply, or not to reply, to rhetorical sarcasm ... In this case, I 
think I will:


Was my specific proposal entirely serious: only in that it was an 
attempt to broaden the box for the discussion of semantics of coercion 
ranges. One of the banes of my life is to undo the sequential mindset 
that so many programmers have. I like to point out that 
sequentialization is an optimization to make programs run faster on 
Von-Neumann architectures. Often, it's premature. Most of the time it 
doesn't matter (compilers, and even HW, can extract ILP), but every now 
and again it results in an unfortunate barrier in solution-space.


Why do we assume that ranges iterate in .succ order -- or even that they 
iterate as integers (and are finite)? Why not iterate as a top-down 
breadth-first generation of a Cantor set? etc. Does the language need to 
choose a default, or is it better to require the programmer to state how 
they want to coerce the range to the seq? Ten years from now, we'll keep 
needing to refer questions to the ".." vs "..." FAQ.

Re: Suggested magic for a .. b

2010-07-28 Thread Moritz Lenz

Dave Whipp wrote:
 Moritz Lenz wrote:
 Dave Whipp wrote:
for 0..10 -> $x { ... }
 is treated as
for (0...10).pick(*) -> $x { ... }
 
 Sorry, I have to ask. Are you serious? Really?
 
 Ah, to reply, or not to reply, to rhetorical sarcasm ... In this case, I 
 think I will:

No sarcasm involved, just curiosity.

 Was my specific proposal entirely serious: only in that it was an 
 attempt to broaden the box for the discussion of semantics of coercion 
 ranges.

I fear what Perl 6 needs is not to broaden the range of discussion even
further, but to narrow it down to the essential points. Personal opinion
only.

 Why do we assume that ranges iterate in .succ order -- or even that they 
 iterate as integers (and are finite)? Why not iterate as a top-down 
 breadth-first generation of a Cantor set?

That's easy: Principle of least surprise.

Cheers.
Moritz

Re: Suggested magic for a .. b


Moritz Lenz wrote:


I fear what Perl 6 needs is not to broaden the range of discussion even
further, but to narrow it down to the essential points. Personal opinion
only.


OK, as a completely serious proposal, the semantics of "for 0..10 { ... 
}" should be for the compiler to complain "sorry, that's a perl5ism: in 
perl6, please use a C<...> or explicit coercion of the range to a sequence".



(BTW, I thought a bit more about my previous suggestion: there is 
precedent in that %hash.keys is unordered -- so it's not entirely 
obvious that a default range coercion should be ordered)

Re: Suggested magic for a .. b

2010-07-28 Thread Aaron Sherman

On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp d...@dave.whipp.name wrote:

 To squint at this slightly, in the context that we already have 0...1e10 as
 a sequence generator, perhaps the semantics of iterating a range should be
 unordered -- that is,

  for 0..10 -> $x { ... }

 is treated as

  for (0...10).pick(*) -> $x { ... }


As others have pointed out, this has some problems. You can't implement 0..*
that way, just for starters.


 Then the whole question of reversibility is moot.


Really? I don't think it is. In fact, you've simply made the problem pop up
everywhere, and guaranteed that .. must behave totally unlike any other
iterator.

Getting back to 10..0...

The complexity of implementation argument doesn't really hold for me, as:

   (a..b).list = a > b ?? a,*.pred ... b !! a,*.succ ... b

Is pretty darned simple and does not require that b implement anything more
than it does under the current implementation. a, on the other hand, now has
to (optionally, since throwing an exception is the alternative) implement
one more method.
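
The one-liner above can be modeled as follows. This is a Python sketch of the proposed behaviour, not of any existing implementation; expand is a hypothetical name, and succ and pred are passed in to show that only the left endpoint is ever stepped.

```python
def expand(a, b, succ, pred):
    # Pick the direction from a single comparison of the endpoints.
    if a > b:
        step, past_end = pred, (lambda x: x < b)
    else:
        step, past_end = succ, (lambda x: x > b)
    out = []
    x = a
    while not past_end(x):
        out.append(x)
        x = step(x)        # only values derived from a are stepped
    return out


def succ(n):
    return n + 1


def pred(n):
    return n - 1


down = expand(10, 7, succ, pred)
up = expand(1, 4, succ, pred)
single = expand(3, 3, succ, pred)
```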

The more I look at this, the more I think ".." and "..." are reversed. ".."
has a very specific and narrow usage (comparing ranges) and "..." is
probably going to be the most broadly used operator in the language outside
of quotes, commas and the basic, C-derived math and logic ops. Many (most?)
loops will involve "...". Most array initializers will involve "...". Why
are we not calling that ".."? Just because we defined ".." first, and it
grandfathered its way in the door? Because it resembles the math op? These
don't seem like good reasons.

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs

Re: Suggested magic for a .. b

2010-07-28 Thread yary

On Wed, Jul 28, 2010 at 2:29 PM, Aaron Sherman a...@ajs.com wrote:

 The more I look at this, the more I think ".." and "..." are reversed. ".."
 has a very specific and narrow usage (comparing ranges) and "..." is
 probably going to be the most broadly used operator in the language outside
 of quotes, commas and the basic, C-derived math and logic ops.

+1

Though it being the day before Rakudo *'s first release makes me
think, too late!

-y

Re: Suggested magic for a .. b

2010-07-28 Thread Leon Timmermans

On Wed, Jul 28, 2010 at 11:29 PM, Aaron Sherman a...@ajs.com wrote:
 The more I look at this, the more I think ".." and "..." are reversed. ".."
 has a very specific and narrow usage (comparing ranges) and "..." is
 probably going to be the most broadly used operator in the language outside
 of quotes, commas and the basic, C-derived math and logic ops. Many (most?)
 loops will involve "...". Most array initializers will involve "...". Why
 are we not calling that ".."? Just because we defined ".." first, and it
 grandfathered its way in the door? Because it resembles the math op? These
 don't seem like good reasons.

I was thinking the same. Switching them seems better from a huffmanization POV.

Leon

Re: Suggested magic for a .. b


Aaron Sherman wrote:

The more I look at this, the more I think ".." and "..." are reversed. ".."
has a very specific and narrow usage (comparing ranges) and "..." is
probably going to be the most broadly used operator in the language outside
of quotes, commas and the basic, C-derived math and logic ops. Many (most?)
loops will involve "...". Most array initializers will involve "...". Why
are we not calling that ".."? Just because we defined ".." first, and it
grandfathered its way in the door? Because it resembles the math op? These
don't seem like good reasons.


I would rather that .. stay with intervals and ... with generators.  The 
mnemonics make more sense that way.  Having .. resemble the math op with the 
same meaning, intervals, is a good thing.  Besides comparing ranges, an interval 
would also often be used for a membership test, eg $a <= $x <= $b would 
alternately be spelled $x ~~ $a..$b for example.  I would imagine that the 
interval use would be more common than the generator use in some problem 
domains. -- Darren Duncan

Re: Suggested magic for a .. b


Darren Duncan wrote:

Aaron Sherman wrote:
The more I look at this, the more I think .. and ... are reversed. 

snip
I would rather that .. stay with intervals and ... with generators.  

snip

Another thing to consider if one is looking at huffmanization is how often the 
versions that exclude endpoints would be used, such as ^..^.


I would imagine that a sequence generator would also find this variability 
useful.

Does ... also come with the 4 variations of endpoint inclusion/exclusion?

If not, then it should, as I'm sure many times one would want to do this, say:

  for 0...^$n -> {...}

In any event, I still think that the mnemonics of "..." (yadda-yadda-yadda) are 
more appropriate to a generator, where it says "produce this and so on".  A ".." 
does not have that mnemonic and looks better for an interval.


-- Darren Duncan
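
For reference, the four endpoint variations of an interval (.., ^.., ..^ and ^..^) amount to two independent flags on the membership test. A minimal Python sketch, with in_interval and its keyword names invented for illustration:

```python
def in_interval(x, lo, hi, open_lo=False, open_hi=False):
    # open_lo models a leading ^ (exclude lo); open_hi models a trailing ^ (exclude hi).
    above = x > lo if open_lo else x >= lo
    below = x < hi if open_hi else x <= hi
    return above and below


closed = [x for x in range(0, 6) if in_interval(x, 1, 4)]                              # 1 .. 4
half_open = [x for x in range(0, 6) if in_interval(x, 1, 4, open_hi=True)]             # 1 ..^ 4
fully_open = [x for x in range(0, 6) if in_interval(x, 1, 4, open_lo=True, open_hi=True)]  # 1 ^..^ 4
```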

Re: Suggested magic for a .. b


Aaron Sherman wrote:

On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp d...@dave.whipp.name wrote:


To squint at this slightly, in the context that we already have 0...1e10 as
a sequence generator, perhaps the semantics of iterating a range should be
unordered -- that is,

 for 0..10 -> $x { ... }

is treated as

 for (0...10).pick(*) -> $x { ... }



As others have pointed out, this has some problems. You can't implement 0..*
that way, just for starters.


I'd say that's a point in my favor: it demonstrates that integers and 
strings have similar problems. If you pick items from an infinite set 
then every item you pick will have an infinite number of digits/characters.


In smart-match context, "a".."b" includes "aardvark". It follows that, 
unless you're filtering/shaping the sequence of generated items, then 
almost every element of ("a".."b").Seq starts with an infinite number of "a"s.


Consistent semantics would make "a".."b" very not-useful when used as a 
sequence: the user needs to say how they want to avoid the infinities. 
Similarly (0..1).Seq should most likely return Real numbers -- and thus 
(0..1).pick(*) can be approximated by (0..1).pick(*, :replace), which is 
much easier to implement.


So either you define some arbitrary semantics (what those should be is, 
I think, the original topic of this thread) or else you punt (error 
message). An error message has the advantage that you can always do 
something useful, later.



Then the whole question of reversibility is moot.

Really? I don't think it is. In fact, you've simply made the problem pop up
everywhere, and guaranteed that .. must behave totally unlike any other
iterator.


%hash.keys has similarly unordered semantics. Therefore 
%hash.keys.reverse is, for most purposes, equivalent to %hash.keys. That 
is why I said the question of reversibility becomes moot if you define 
the collapse of a range to a sequence to be unordered. It also 
demonstrates precedent, so not totally unlike any other.


Even though it was only a semi-serious proposal, I seem to find myself 
defending it. So maybe I was serious, after all. That argument for DWIM 
being ordered pretty much goes away once you tell people to use "..." 
for what they intended to mean.




Getting back to 10..0


Yes, I agree with Jon that this should be an empty range. I don't care 
what order you pick the elements from an empty range :).

Re: Suggested magic for a .. b


Dave Whipp wrote:

Similarly (0..1).Seq should most likely return Real numbers


No it shouldn't, because the endpoints are integers.

If you want Real numbers, then say 0.0 .. 1.0 instead.

-- Darren Duncan

Re: Suggested magic for a .. b


Darren Duncan wrote:

Dave Whipp wrote:

Similarly (0..1).Seq should most likely return Real numbers


No it shouldn't, because the endpoints are integers.

If you want Real numbers, then say 0.0 .. 1.0 instead.

-- Darren Duncan


That would be inconsistent. $x ~~ 0..1 means 0 <= $x <= 1. The fact that 
the endpoints are integers does not imply that the range does not include 
non-integer reals.


My argument is that iterating a range could be defined to give you a 
uniform distribution of values that would smart match true against that 
range -- and that such a definition would be just as reasonable as (and 
perhaps more general than) one that says that you get an incrementing 
ordered set of integers across that range.
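
What "iterate by sampling values that smart-match the range" might look like, as a minimal Python sketch. It samples with replacement, which corresponds to the (0..1).pick(*, :replace) approximation mentioned earlier in the thread; sample_interval and the fixed seed are assumptions made for the example.

```python
import random


def sample_interval(lo, hi, n, seed=0):
    # Draw n values uniformly from [lo, hi], with replacement.
    rng = random.Random(seed)
    return [rng.uniform(lo, hi) for _ in range(n)]


values = sample_interval(0, 1, 5)
# Every drawn value satisfies the same test that "$x ~~ 0..1" expresses.
all_match = all(0 <= v <= 1 for v in values)
```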

Re: Suggested magic for a .. b

2010-07-28 Thread Aaron Sherman

On Wed, Jul 28, 2010 at 6:24 PM, Dave Whipp d...@dave.whipp.name wrote:

 Aaron Sherman wrote:

 On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp d...@dave.whipp.name
 wrote:

 To squint at this slightly, in the context that we already have 0...1e10 as
 a sequence generator, perhaps the semantics of iterating a range should be
 unordered -- that is,

  for 0..10 -> $x { ... }

 is treated as

  for (0...10).pick(*) -> $x { ... }


 As others have pointed out, this has some problems. You can't implement 0..*
 that way, just for starters.


 I'd say that's a point in my favor: it demonstrates that integers and
 strings have similar problems. If you pick items from an infinite set then
 every item you pick will have an infinite number of digits/characters.


So, if I understand you correctly, you're happy about the fact that
iterating over an explicitly lazy range would immediately result in
failure? Sorry, not following.



 In smart-match context, "a".."b" includes "aardvark".


No one has yet explained to me why that makes sense. The continued use of
ASCII examples, of course, doesn't help. Does "a" .. "b" include "æther"?
This is where Germans and Swedes, for example, don't agree, but they're all
using the same Latin code blocks.

I don't think you can reasonably bring locale into this. I think it needs to
be purely a codepoint-oriented operator. If you bring locale into it, then
the argument for not including composing and modifying characters goes out
the window, and you're stuck in what I believe Dante called the Unicode
circle. If you treat this as a codepoint-based operator then you get a very
simple result: "a".."b" is the range between the codepoint for "a" and the
codepoint for "b". "aa" .. "bb" is the range between a sequence of two
codepoints and a sequence of two other code points, which you can define in
a number of ways (we've discussed a few, here) which don't involve having to
expand the sequences to three or more codepoints.

I've never accepted that the range between two strings of identical length
should include strings of another length. That seems maximally non-intuitive
(well, I suppose you could always return the last 100 words of Hamlet as an
iterable IO object if you really wanted to confuse people), and makes string
and integer ranges far too divergent.
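
Both halves of this argument can be illustrated with a short sketch. Python compares strings by code point, so it serves as a model of a purely codepoint-oriented membership test; fixed_length_range is just one hypothetical way to define a same-length string range (an independent code point range per position), not something the thread settled on.

```python
from itertools import product


def in_str_range(lo, hi, s):
    # Python compares str values code point by code point, with no locale involved.
    return lo <= s <= hi


def fixed_length_range(lo, hi):
    # One possible definition: each position runs over its own code point range,
    # so every result has the same length as the endpoints.
    if len(lo) != len(hi):
        raise ValueError("endpoints must have the same length")
    columns = [range(ord(a), ord(b) + 1) for a, b in zip(lo, hi)]
    return ["".join(map(chr, combo)) for combo in product(*columns)]


has_aardvark = in_str_range("a", "b", "aardvark")    # sorts between "a" and "b"
has_aether = in_str_range("a", "b", "\u00e6ther")    # U+00E6 sorts after "b" (U+0062)
two_letter = fixed_length_range("aa", "bb")
```

Under codepoint ordering the answer to the "æther" question is simply no, while "aardvark" is still inside the interval, which is the divergence between membership and enumeration that the thread keeps returning to.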



  Then the whole question of reversibility is moot.

 Really? I don't think it is. In fact, you've simply made the problem pop up
 everywhere, and guaranteed that .. must behave totally unlike any other
 iterator.


 %hash.keys has similarly unordered semantics.


Unordered semantics and shuffled values aren't the same thing. The reason
that hash keys are unordered is that we cannot guarantee that any given
implementation will store entries in any given relation to the input. Ranges
have a well defined ordering associated with the elements that fall within
the range by virtue of the basic definition of a range (LHS <= * <= RHS).
Hashes have no ordering associated with their keys (though one can be
imposed, e.g. by sort).


Therefore %hash.keys.reverse is, for most purposes, equivalent to
 %hash.keys.


Argh! No, that's entirely untrue. %hash.keys and %hash.keys.reverse had
better be the same elements, but reversed for all hashes which remain
unmodified between the first and second call.


-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs

Re: Suggested magic for a .. b

Darren Duncan wrote:
 Does ... also come with the 4 variations of endpoint inclusion/exclusion?

 If not, then it should, as I'm sure many times one would want to do this,
 say:

  for 0...^$n -> {...}

You can toggle the inclusion/exclusion of the ending condition by
choosing between ... and ...^; but the starting point is the
starting point no matter what: there is neither ^... nor ^...^.

 In any event, I still think that the mnemonics of "..." (yadda-yadda-yadda)
 are more appropriate to a generator, where it says "produce this and so on".
  A ".." does not have that mnemonic and looks better for an interval.

Well put.  This++.

-- 
Jonathan Dataweaver Lang
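
The asymmetry Jon describes (the end condition can be excluded, the start cannot) is easy to see in a model. This Python sketch only counts upward by one; sequence and exclude_end are hypothetical names standing in for "..." and "...^".

```python
def sequence(start, end, exclude_end=False):
    # The start is always produced (if it is in range); only the end is optional.
    x = start
    while x < end or (x == end and not exclude_end):
        yield x
        x += 1


inclusive = list(sequence(0, 3))                      # models 0 ... 3
exclusive = list(sequence(0, 3, exclude_end=True))    # models 0 ...^ 3
```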

Re: Suggested magic for a .. b

Aaron Sherman wrote:
 In smart-match context, "a".."b" includes "aardvark".


 No one has yet explained to me why that makes sense. The continued use of
 ASCII examples, of course, doesn't help. Does "a" .. "b" include "æther"?
 This is where Germans and Swedes, for example, don't agree, but they're all
 using the same Latin code blocks.

This is definitely something for the Unicode crowd to look into.  But
whatever solution you come up with, please make it compatible with the
notion that "aardvark".."apple" can be used to match any word in the
dictionary that comes between those two words.

 I've never accepted that the range between two strings of identical length
 should include strings of another length. That seems maximally non-intuitive
 (well, I suppose you could always return the last 100 words of Hamlet as an
 iterable IO object if you really wanted to confuse people), and makes string
 and integer ranges far too divergent.

This is why I dislike the notion of the range operator being used to
produce lists: the question of what values you'd get by iterating from
one string value to another is _very_ different from the question of
what string values qualify as being between the two.  The more you use
infix:<..> to produce lists, the more likely you are to conflate lists
with ranges.

-- 
Jonathan Dataweaver Lang

Re: Suggested magic for a .. b


On 2010-07-29 00:24, Dave Whipp wrote:

Aaron Sherman wrote:
On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp d...@dave.whipp.name 
wrote:


To squint at this slightly, in the context that we already have 
0...1e10 as a sequence generator, perhaps the semantics of iterating a 
range should be unordered -- that is,

 for 0..10 -> $x { ... }

is treated as

 for (0...10).pick(*) -> $x { ... }



As others have pointed out, this has some problems. You can't 
implement 0..* that way, just for starters.


I'd say that's a point in my favor: it demonstrates that integers and 
strings have similar problems. If you pick items from an infinite set 
then every item you pick will have an infinite number of 
digits/characters.


In smart-match context, "a".."b" includes "aardvark". It follows that, 
unless you're filtering/shaping the sequence of generated items, then 
almost every element of ("a".."b").Seq starts with an infinite number of 
"a"s.


Consistent semantics would make "a".."b" very not-useful when used as 
a sequence: the user needs to say how they want to avoid the 
infinities. Similarly (0..1).Seq should most likely return Real 
numbers -- and thus (0..1).pick(*) can be approximated by 
(0..1).pick(*, :replace), which is much easier to implement.

I agree that /in theory/, when coercing from Range to Sequence, the new 
Sequence should produce every possible value in the Range, unless you 
specify an increment. You could argue that 0 and 1 in (0..1).Seq are 
Ints, resulting in the expansion 0, 1, but that would leave a door open 
for very nasty surprises.


In practise, producing every possible value in a Range with 
over-countable items isn't useful and just opens the door for 
inexperienced programmers to make perl run out of memory without ever 
producing a warning, so I'd suggest that the conversion should fail 
unless an increment is specified.
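
A minimal sketch of "fail unless an increment is specified", in Python. The rules here are assumptions made for illustration: integer endpoints default to a step of one, anything else needs an explicit step, and an unbounded upper endpoint is refused because the expansion is eager. to_sequence is a hypothetical name.

```python
from fractions import Fraction
import math


def to_sequence(lo, hi, step=None):
    if isinstance(hi, float) and math.isinf(hi):
        raise ValueError("refusing to expand an unbounded range eagerly")
    if step is None:
        if isinstance(lo, int) and isinstance(hi, int):
            step = 1     # integers have an obvious successor
        else:
            raise TypeError("no natural increment for these endpoints; pass step")
    if step <= 0:
        raise ValueError("step must be positive")
    out, x = [], lo
    while x <= hi:
        out.append(x)
        x += step
    return out


ints = to_sequence(0, 3)
quarters = to_sequence(Fraction(0), Fraction(1), Fraction(1, 4))
```

Calling to_sequence(Fraction(0), Fraction(1)) without a step raises TypeError instead of guessing, which is the "punt with an error" option.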


The general principle would be to avoid meaningless conversions, so 
(1 .. *).Seq and (1 .. *).pick should also just fail, but with finite 
endpoints, they could succeed. The question here is whether we should open 
up for more parallelization at the cost of simplicity. I don't know.


So either you define some arbitrary semantics (what those should be 
is, I think, the original topic of this thread) or else you punt 
(error message). An error message has the advantage that you can 
always do something useful, later.

I second that just doing something arbitrary where no actual definition 
exists is a really bad idea. To be more specific, there should be no 
.succ or .pred methods on Rat, Str, Real, Complex and anything else that 
is over-countable. Trying to implement .succ on something like Str is 
most likely dwimmy to a very narrow set of applications, but will 
confuse everyone else.


Just to illustrate my point, if we have .succ on Str, why not have it on 
Range or Seq?


Let's just play with that idea for a second - what would a reasonable 
implementation of .succ on Range be?


(1 .. 10).succ --?-- (1 .. 11)
(1 .. 10).succ --?-- (2 .. 11)
(1 .. 10).succ --?-- (1 .. 12)
(1 .. 10).succ --?-- (10^ .. *)

Even starting a discussion about which implementation of .succ for Range 
(above), Str, Rat or Real completely misses the point: there is no 
definition of this function for those domains. It is non-existent and 
trying to do something dwimmy is just confusing.


As a sidenote, ++ and .succ should be treated as two different things 
(just like -- and .pred). ++ really means "add one" everywhere and can 
be kept as such, whereas .succ means the next, smallest possible item. 
This means that we can keep ++ and -- for all numeric types.


Coercing to Sequence from Range should by default use .succ on the LHS, 
whereas Seq could just use ++ semantics as often as desired. This would 
make Ranges completely consistent and provide a clear distinction 
between the two classes.
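
The proposed split between ++ and .succ could be modelled like this in Python (an illustration only; `increment` and `succ` are hypothetical names, and restricting `succ` to integers is this sketch's assumption about which types have a "next" item):

```python
from fractions import Fraction


def increment(x):
    """'++' semantics: add one, meaningful for every numeric type."""
    return x + 1


def succ(x):
    """'.succ' semantics: the next item, defined only where one exists."""
    if isinstance(x, bool) or not isinstance(x, int):
        raise TypeError(f"{type(x).__name__} has no successor")
    return x + 1


print(increment(Fraction(1, 2)))  # adding one to a rational is fine
print(succ(41))                   # integers do have a next item
```

Here `succ(Fraction(1, 2))` raises `TypeError`, while `increment` still works on the same value.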

Getting back to 10..0


Yes, I agree with Jon that this should be an empty range. I don't care 
what order you pick the elements from an empty range :).

Either empty, the same as 0 .. 10 or throw an error (I like errors :).

Regards,

Michael.

Re: Suggested magic for a .. b


On 2010-07-29 01:39, Jon Lang wrote:

Aaron Sherman wrote:


In smart-match context, "a".."b" includes "aardvark".


No one has yet explained to me why that makes sense. The continued use of
ASCII examples, of course, doesn't help. Does "a" .. "b" include "æther"?
This is where Germans and Swedes, for example, don't agree, but they're all
using the same Latin code blocks.


This is definitely something for the Unicode crowd to look into.  But
whatever solution you come up with, please make it compatible with the
notion that "aardvark".."apple" can be used to match any word in the
dictionary that comes between those two words.


The key issue here is whether there is a well defined and meaningful 
ordering of the characters in question. We keep discussing the nice 
examples, but how about "apple" .. "ส้ม"?


I don't know enough about Unicode to suggest how to solve this. All I 
can say is that my example above should never return a valid Range 
object unless there is a way I can specify my own ordering and I use it.



I've never accepted that the range between two strings of identical length
should include strings of another length. That seems maximally non-intuitive
(well, I suppose you could always return the last 100 words of Hamlet as an
iterable IO object if you really wanted to confuse people), and makes string
and integer ranges far too divergent.


This is why I dislike the notion of the range operator being used to
produce lists: the question of what values you'd get by iterating from
one string value to another is _very_ different from the question of
what string values qualify as being between the two.  The more you use
infix:<..> to produce lists, the more likely you are to conflate lists
with ranges.


I second the above. Ranges are all about comparing things. $x ~~ $a .. 
$b means "is $x between $a and $b?". The only broadly accepted 
comparison of strings is lexicographical comparison. To illustrate the 
point: wouldn't you find it odd if 2.01 wasn't in between 1.1 and 2.1? 
Really?
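
The comparison-only view of a range can be checked with a few lines of Python (a model, not Perl 6; `between` is a hypothetical helper, and Python's built-in string comparison stands in for lexicographical comparison):

```python
def between(x, lo, hi):
    """Inclusive range test that relies on comparison alone."""
    return lo <= x <= hi


# More digits does not put a number outside the range...
print(between(2.01, 1.1, 2.1))
# ...and more letters does not put a string outside it either.
print(between("aardvark", "a", "b"))
# "ba" sorts after "b", so it falls outside "a".."b".
print(between("ba", "a", "b"))
```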


Regards,

Michael.

Re: Suggested magic for a .. b

Michael Zedeler wrote:
 Jon Lang wrote:
 This is definitely something for the Unicode crowd to look into.  But
 whatever solution you come up with, please make it compatible with the
 notion that "aardvark".."apple" can be used to match any word in the
 dictionary that comes between those two words.

 The key issue here is whether there is a well defined and meaningful
 ordering of the characters in question. We keep discussing the nice
 examples, but how about "apple" .. "ส้ม"?

All I'm saying is: don't throw out the baby with the bathwater.  Come
up with an interim solution that handles the nice examples intuitively
and the ugly examples poorly (or better, if you can manage that right
out of the gate); then revise the model to improve the handling of the
ugly examples as much as you can; but while you do so, make an effort
to keep the nice examples working.

 I don't know enough about Unicode to suggest how to solve this. All I can
 say is that my example above should never return a valid Range object unless
 there is a way I can specify my own ordering and I use it.

That actually says something: it says that we may want to reconsider
the notion that all string values can be sorted.  You're suggesting
the possibility that "a" cmp "ส้" is, by default, undefined.

There are some significant problems that arise if you do this.

-- 
Jonathan Dataweaver Lang

Re: Suggested magic for a .. b

2010-07-28 Thread Chris Fields

On Jul 28, 2010, at 1:37 PM, Mark J. Reed wrote:

 On Wed, Jul 28, 2010 at 2:30 PM, Chris Fields cjfie...@illinois.edu wrote:
 On Jul 28, 2010, at 1:27 PM, Mark J. Reed wrote:
 Can I get an Amen?  Amen!
 --
 Mark J. Reed markjr...@gmail.com
 
 +1.  I'm agnostic ;
 
 Militant?  :)  ( http://tinyurl.com/3xjgxnl )
 
 Nothing inherently religious about amen (or me), but I'll accept
 +1 as synonymous.   :)
 
 -- 
 Mark J. Reed markjr...@gmail.com

Not militant, just trying to inject a bit of humor into the zombie thread that 
won't die.

chris

Re: Suggested magic for a .. b

2010-07-28 Thread Chris Fields

On Jul 28, 2010, at 1:27 PM, Mark J. Reed wrote:

 On Wednesday, July 28, 2010, Jon Lang datawea...@gmail.com wrote:
 Keep it simple, folks!  There are enough corner cases in Perl 6 as
 things stand; we don't need to be introducing more of them if we can
 help it.
 
 Can I get an Amen?  Amen!
 -- 
 Mark J. Reed markjr...@gmail.com

+1.  I'm agnostic ;

chris

Re: Suggested magic for a .. b


On 2010-07-29 02:19, Jon Lang wrote:

Michael Zedeler wrote:
   

Jon Lang wrote:
 

This is definitely something for the Unicode crowd to look into.  But
whatever solution you come up with, please make it compatible with the
notion that "aardvark".."apple" can be used to match any word in the
dictionary that comes between those two words.
   

The key issue here is whether there is a well defined and meaningful
ordering of the characters in question. We keep discussing the nice
examples, but how about "apple" .. "ส้ม"?
 

All I'm saying is: don't throw out the baby with the bathwater.  Come
up with an interim solution that handles the nice examples intuitively
and the ugly examples poorly (or better, if you can manage that right
out of the gate); then revise the model to improve the handling of the
ugly examples as much as you can; but while you do so, make an effort
to keep the nice examples working.
   
I am sorry if what I write is understood as an argument against ranges 
of strings. I think I know too little about Unicode to be able to do 
anything but point at some issues I believe we'll have to deal with. The 
solution is not obvious to me.

I don't know enough about Unicode to suggest how to solve this. All I can
say is that my example above should never return a valid Range object unless
there is a way I can specify my own ordering and I use it.
 

That actually says something: it says that we may want to reconsider
the notion that all string values can be sorted.  You're suggesting
the possibility that "a" cmp "ส้" is, by default, undefined.
   

Yes, but I am sure it's due to my lack of understanding of Unicode.

Regards,

Michael.

Re: Suggested magic for a .. b


Jon Lang wrote:

I don't know enough about Unicode to suggest how to solve this. All I can
say is that my example above should never return a valid Range object unless
there is a way I can specify my own ordering and I use it.


That actually says something: it says that we may want to reconsider
the notion that all string values can be sorted.  You're suggesting
the possibility that "a" cmp "ส้" is, by default, undefined.


I think that a general solution here is to accept that there may be more than 
one valid way to sort some types, strings especially, and so operators/routines 
that do sorting should be customizable in some way so users can pick the 
behaviour they want.


The customization could be applied at various levels, such as using an extra 
argument or trait for the operator/function that cares about ordering, or by 
using an extra attribute or trait for the types being sorted.


In fact, this whole issue is very close in concept to the situations where you 
need to do equality/identity tests.


With strings, identity tests can change answers depending on whether you are 
doing it on language-dependent or language-independent graphemes, and Perl 6 
encodes that abstraction level as value metadata.


When you want to be consistent, the behaviour of cmp affects all of the other 
order-sensitive operations, including any working with intervals.


Some possible examples of customization:

  $foo ~~ $a..$b :QuuxNationality  # just affects this one test

  $bar = 'hello' :QuuxNationality  # applies anywhere the Str value is used

Also, declaring a Str subtype or something.

Of course, after all this, we still want some reasonable default.  I suggest 
that for Str that aren't nationality-specific, the default ordering semantics 
are by whatever generic ordering Unicode defines, which might be by codepoint. 
And then for Str with nationality-specific grapheme abstractions, the default 
sorting can be whatever is the case for that nationality.  And this is how it is 
except where users define some other order.


So then, "a" cmp "ส้" is always defined, but users can change the definition.
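
A rough Python sketch of "always defined, but customizable" ordering (not Perl 6; `in_range`, its `key` parameter, and the `ae_folding` rule are all hypothetical, and Python's codepoint-wise string comparison is assumed as the generic default):

```python
def in_range(x, lo, hi, key=None):
    """Inclusive membership test under a caller-chosen ordering."""
    if key is None:
        key = lambda s: s  # default: Python compares strings by codepoint
    return key(lo) <= key(x) <= key(hi)


def ae_folding(s):
    # Hypothetical collation rule: sort "æ" as if it were spelled "ae".
    return s.replace("æ", "ae")


# The default ordering always gives an answer, even across scripts.
print(in_range("ส้ม", "apple", "ส้ม"))
# The same test can give different answers under different orderings.
print(in_range("æther", "a", "b"))
print(in_range("æther", "a", "b", key=ae_folding))
```

The per-call `key` argument corresponds to customizing a single test; attaching the ordering to the string type itself would be the other option mentioned above.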

-- Darren Duncan

Re: Suggested magic for a .. b

2010-07-28 Thread Brandon S Allbery KF8NH

 On 7/28/10 8:07 PM, Michael Zedeler wrote:
 On 2010-07-29 01:39, Jon Lang wrote:
 Aaron Sherman wrote:
 In smart-match context, "a".."b" includes "aardvark".
 No one has yet explained to me why that makes sense. The continued use of
 ASCII examples, of course, doesn't help. Does "a" .. "b" include "æther"?
 This is where Germans and Swedes, for example, don't agree, but they're all
 using the same Latin code blocks.
 This is definitely something for the Unicode crowd to look into.  But
 whatever solution you come up with, please make it compatible with the
 notion that "aardvark".."apple" can be used to match any word in the
 dictionary that comes between those two words.
 The key issue here is whether there is a well defined and meaningful
 ordering of the characters in question. We keep discussing the nice
 examples, but how about "apple" .. "ส้ม"?

I thought that was already disallowed by spec.

Re: Suggested magic for a .. b

2010-07-27 Thread Aaron Sherman

Sorry I haven't responded for so long... much going on in my world.

On Mon, Jul 26, 2010 at 11:35 AM, Nicholas Clark n...@ccl4.org wrote:

 On Tue, Jul 20, 2010 at 07:31:14PM -0400, Aaron Sherman wrote:

  2) We deny that a range whose LHS is larger than its RHS makes sense, but
  we also don't provide an easy way to construct such ranges lazily otherwise.
  This would be annoying only, but then we have declared that ranges are the
  right way to construct basic loops (e.g. for (1..1e10).reverse -> $i {...}
  which is not lazy (blows up your machine) and feels awfully clunky next to
  for 1e10..1 -> $i {...} which would not blow up your machine, or even make
  it break a sweat, if it worked)

 There is no reason why for (1..1e10).reverse -> $i {...} should *not* be lazy.


As a special case, perhaps you can treat ranges as special and not as simple
iterators. To be honest, I wasn't thinking about the possibility of such
special cases, but about iterators in general. You can't generically reverse
lazy constructs without running afoul of the halting problem, which I invite
you to solve at your leisure ;-)

For example, let's just tie it to integer factorization to make it really
obvious:

 # Generator for ranges of sequential, composite integers
 sub composites(Int $start) { gather do { for $start .. * -> $i {
   last if isprime($i);
   take $i;
 } } }
 for composites(10116471302318).reverse -> $i { say $i }

The first value should be 10116471302380, but computing that without
iterating through the list from start to finish would require knowing that
none of the integers between 10116471302318 and 10116471302380, inclusive,
are prime. Of course, the same problem exists for any iterator where the end
condition or steps can't be easily pre-computed, but this makes it more
obvious than most.
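
The same point can be reproduced on a small scale in Python (a sketch, not the Perl 6 above; `is_prime` is a naive trial-division helper written for this illustration and the start value 24 is arbitrary). A generator cannot be reversed without being run to the end first:

```python
from itertools import count


def is_prime(n):
    """Naive trial division; good enough for small illustrative inputs."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def composites(start):
    """Yield consecutive integers from start, stopping at the first prime."""
    for i in count(start):
        if is_prime(i):
            return
        yield i


run = composites(24)
try:
    reversed(run)          # generators expose no reverse iterator
    reversible = True
except TypeError:
    reversible = False

# The only general way to get the tail first is to walk the whole run.
tail_first = list(composites(24))[::-1]
print(reversible, tail_first)
```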

That means that Range.reverse has to do something special that iterators in
general can't be relied on to do. Does that introduce problems? Not big
ones. I can definitely see people who are used to for ($a .. $b).reverse ->
... getting confused when for @blah.reverse -> ... blows up their
machine, but avoiding that confusion might not be practical.

PS: On a really abstract note, requiring that ($a .. $b).reverse be lazy
will put new constraints on the right hand side parameter. Previously, it
didn't have to have a value of its own, it just had to be comparable to
other values. For example:

  for $a .. $b -> $c { ... }

In that, we don't include the RHS in the output range explicitly. Instead,
we increment $a (via .succ) until it's >= $b. If $a were 1 and $b were an
object that does Int but just implements the comparison features, and has
no fixed numeric value, then it should still work (e.g. it could be random).
Now that's not possible because we need to use the RHS as the starting point
when .reverse is invoked.

I have no idea if that matters, but it's important to be aware of when and
where we constrain the interface rather than discovering it later.

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs

Re: Suggested magic for a .. b

2010-07-27 Thread Jon Lang

Aaron Sherman wrote:
 As a special case, perhaps you can treat ranges as special and not as simple
 iterators. To be honest, I wasn't thinking about the possibility of such
 special cases, but about iterators in general. You can't generically reverse
 lazy constructs without running afoul of the halting problem, which I invite
 you to solve at your leisure ;-)

A really obvious example occurs when the RHS is a Whatever:

   (1..*).reverse;

.reverse magic isn't going to be generically applicable to all lazy
lists; but it can be applicable to all lazy lists that have predefined
start points, end points, and bidirectional iterators, and on all lazy
lists that have random-access iterators and some way of locating the
tail.  Sometimes you can guess what the endpoint and backward-iterator
should be from the start point and the forward-iterator, just as the
infix:<...> operator is able to guess what the forward-iterator should
be from the first one, two, or three items in the list.

This is especially a problem with regard to lists generated using the
series operator, as it's possible to define a custom forward-iterator
for it (but not, AFAICT, a custom reverse-iterator).  In comparison,
the simplicity of the range operator's list generation algorithm
almost guarantees that as long as you know for certain what or where
the last item is, you can lazily generate the list from its tail.  But
only almost:

   (1..3.5); # list context: 1, 2, 3
   (1..3.5).reverse; # list context: 3.5, 2.5, 1.5 - assuming list is
                     # generated from tail.
   (1..3.5).reverse; # list context: 3, 2, 1 - but only if you
                     # generate it from the head first, and then reverse it.

Again, the proper tool for list generation is the series operator,
because it can do everything that the range operator can do in terms
of list generation, and more.

1 ... 3.5 # same as 1, 2, 3
3.5 ... 1 # same as 3.5, 2.5, 1.5 - and obviously so.
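
To see why the two readings of (1..3.5).reverse disagree, here is a small Python model (not Perl 6; `from_head` and `from_tail` are names made up for this sketch, and a step of one is assumed):

```python
def from_head(lo, hi):
    """Generate upward from lo in steps of one, stopping before exceeding hi."""
    out, x = [], lo
    while x <= hi:
        out.append(x)
        x += 1
    return out


def from_tail(lo, hi):
    """Generate downward from hi in steps of one, stopping before passing lo."""
    out, x = [], hi
    while x >= lo:
        out.append(x)
        x -= 1
    return out


print(from_head(1, 3.5)[::-1])  # head first, then reversed
print(from_tail(1, 3.5))        # generated directly from the tail
```

The two results contain different elements, which is why a lazy tail-first .reverse is not a pure optimization when the endpoints are not aligned on the step.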

With this in mind, I see no reason to allow any magic on .reverse when
dealing with the range operator (or the series operator, for that
matter): as far as it's concerned, it's dealing with a list that lacks
a reverse-iterator, and so it will _always_ generate the list from its
head to its tail before attempting to reverse it.  Maybe at some later
point, after we get Perl 6.0 out the door, we can look into revising
the series operator to permit more powerful iterators so as to allow
.reverse and the like to bring more dwimmy magic to bear.

-- 
Jonathan Dataweaver Lang

Re: Suggested magic for a .. b

2010-07-27 Thread Michael Zedeler


On 2010-07-27 23:50, Aaron Sherman wrote:

PS: On a really abstract note, requiring that ($a .. $b).reverse be lazy
will put new constraints on the right hand side parameter. Previously, it
didn't have to have a value of its own, it just had to be comparable to
other values. For example:

   for $a .. $b -> $c { ... }

In that, we don't include the RHS in the output range explicitly. Instead,
we increment $a (via .succ) until it's >= $b. If $a were 1 and $b were an
object that does Int but just implements the comparison features, and has
no fixed numeric value, then it should still work (e.g. it could be random).
Now that's not possible because we need to use the RHS as the starting point
when .reverse is invoked.

This is exactly why I keep writing posts about Ranges being defunct as 
they have been specified now. If we accept the premise that Ranges are 
supposed to define a kind of linear membership specification between two 
starting points (as in math), it doesn't make sense that the LHS has an 
additional constraint (having to provide a .succ method). All we should 
require is that both endpoints supports comparison (that they share a 
common type with comparison, at least).


To provide expansion to lists, such as for $a .. $b -> $c { ... }, we 
should use type coercion semantics, coercing from Range to Sequence and 
throw an error if the LHS doesn't support .succ.


Writing ($a .. $b).reverse doesn't make any sense if the result were a 
new Range, since Ranges should then only be used for inclusion tests (so 
swapping endpoints doesn't have any meaningful interpretation), but 
applying .reverse could result in a coercion to Sequence.


Writing for ($a .. $b).reverse -> $c { ...} may then blow up because it 
turns out that $b doesn't have a .succ method when coercing to sequence 
(where the LHS must have an initial value), just like for $a .. $b -> $c 
{ ... } should be able to blow up because the LHS of a Range shouldn't 
have to support .succ.
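
A minimal Python sketch of this proposal (a model only, not Perl 6; the class `Interval` and its methods are hypothetical, and treating "is an integer" as "has a successor and predecessor" is an assumption of the sketch). Constructing the range and testing membership need nothing but comparison; the stepping requirement only appears when coercing to a sequence, and the reversed walk uses a predecessor on the right endpoint:

```python
class Interval:
    """A range that only needs its endpoints to be comparable."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __contains__(self, x):
        # Membership uses comparison only; no successor is involved.
        return self.lo <= x <= self.hi

    def to_seq(self, succ=None):
        """Coerce to a lazy sequence; only now must lo have a successor."""
        if succ is None:
            if not isinstance(self.lo, int):
                raise TypeError("left endpoint has no successor")
            succ = lambda v: v + 1
        return self._walk(self.lo, succ, lambda v: v <= self.hi)

    def reversed_seq(self, pred=None):
        """Walk down from hi; only now must hi have a predecessor."""
        if pred is None:
            if not isinstance(self.hi, int):
                raise TypeError("right endpoint has no predecessor")
            pred = lambda v: v - 1
        return self._walk(self.hi, pred, lambda v: v >= self.lo)

    @staticmethod
    def _walk(start, step, in_bounds):
        x = start
        while in_bounds(x):
            yield x
            x = step(x)


print(2.5 in Interval(1, 3))
print(list(Interval(1, 3).to_seq()))
print(list(Interval(1, 3).reversed_seq()))
```

With these rules `Interval("a", "b")` is a perfectly good range for matching, but `Interval("a", "b").to_seq()` raises `TypeError` unless the caller supplies a `succ` function.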


Regards,

Michael.

Re: Suggested magic for a .. b

2010-07-26 Thread Nicholas Clark

On Tue, Jul 20, 2010 at 07:31:14PM -0400, Aaron Sherman wrote:

 2) We deny that a range whose LHS is larger than its RHS makes sense, but
 we also don't provide an easy way to construct such ranges lazily otherwise.
 This would be annoying only, but then we have declared that ranges are the
 right way to construct basic loops (e.g. for (1..1e10).reverse -> $i {...}
 which is not lazy (blows up your machine) and feels awfully clunky next to
 for 1e10..1 -> $i {...} which would not blow up your machine, or even make
 it break a sweat, if it worked)

There is no reason why for (1..1e10).reverse -> $i {...} should *not* be lazy.

After all, Perl 5 now implements

@b = reverse sort @a

by directly sorting in reverse. Note how it's now an ex-reverse:

$ perl -MO=Concise -e '@b = reverse sort @a'
c  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
b     <2> aassign[t6] vKS ->c
-        <1> ex-list lK ->8
3           <0> pushmark s ->4
-           <1> ex-reverse lK/1 ->-
4              <0> pushmark s ->5
7              <@> sort lK/REV ->8
-                 <0> ex-pushmark s ->5
6                 <1> rv2av[t4] lK/1 ->7
5                    <#> gv[*a] s ->6
-        <1> ex-list lK ->b
8           <0> pushmark s ->9
a           <1> rv2av[t2] lKRM*/1 ->b
9              <#> gv[*b] s ->a
-e syntax OK

Likewise

foreach (reverse @a) {...}

is implemented as a reverse iterator on the array, rather than a temporary
list:

$ perl -MO=Concise -e 'foreach(reverse @a) {}'
d  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 2 -e:1) v ->3
c     <2> leaveloop vK/2 ->d
7        <{> enteriter(next->9 last->c redo->8) lKS/REVERSED ->a
-           <0> ex-pushmark s ->3
-           <1> ex-list lKM ->6
3              <0> pushmark s ->4
-              <1> ex-reverse lKM/1 ->6
-                 <0> ex-pushmark s ->4
5                 <1> rv2av[t2] sKR/1 ->6
4                    <#> gv[*a] s ->5
6           <#> gv[*_] s ->7
-        <1> null vK/1 ->c
b           <|> and(other->8) vK/1 ->c
a              <0> iter s/REVERSED ->b
-              <@> lineseq vK ->-
8                 <0> stub v ->9
9                 <0> unstack v ->a
-e syntax OK



If it's part of the specification that (1..1e10).reverse is to be implemented
lazily, I'd (personally) consider that an easy enough way to construct a lazy
range.
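
For comparison, Python's built-in `range` already behaves this way, which suggests that a lazily reversed integer range is cheap to provide (this illustrates the idea in another language and says nothing about how Perl 6 implements it):

```python
# A ten-billion-element range; only its endpoints and step are stored.
big = range(1, 10**10 + 1)

# reversed() on a range returns a lazy iterator instead of building a list.
countdown = reversed(big)

# Pull the first three values without materializing the rest.
first_three = [next(countdown) for _ in range(3)]
print(first_three)
```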


This doesn't answer any of your other questions about what ranges of
character strings should mean. I don't really have an opinion, other than
it needs to be simple enough to be teachable.

Nicholas Clark

Re: Suggested magic for a .. b

2010-07-21 Thread Smylers

Jon Lang writes:

 Approaching this with the notion firmly in mind that infix:<..> is
 supposed to be used for matching ranges while infix:<...> should be
 used to generate series:
 
 With series, we want C< $LHS ... $RHS > to generate a list of items
 starting with $LHS and ending with $RHS.  If $RHS > $LHS, we want it
 to increment one step at a time; if $RHS < $LHS, we want it to
 decrement one step at a time.

Do we? I'm used to generating lists and iterating over them (in Perl 5)
with things like:

  for (1 .. $max)

where the intention is that if $max is zero, the loop doesn't execute at
all. Having the equivalent Perl 6 list generation operator, C<...>,
start counting backwards could be confusing.

Especially if Perl 6 also has a range operator, C<..>, which would Do
The Right Thing for me in this situation, and where the Perl 6 operator
that Does The Right Thing is spelt the same as the Perl 5 operator that
I'm used to; that muddles the distinction you make above about matching
ranges versus generating lists.

Smylers
-- 
http://twitter.com/Smylers2

Re: Suggested magic for a .. b

2010-07-21 Thread Jon Lang

Smylers wrote:
 Jon Lang writes:
 Approaching this with the notion firmly in mind that infix:<..> is
 supposed to be used for matching ranges while infix:<...> should be
 used to generate series:

 With series, we want C< $LHS ... $RHS > to generate a list of items
 starting with $LHS and ending with $RHS.  If $RHS > $LHS, we want it
 to increment one step at a time; if $RHS < $LHS, we want it to
 decrement one step at a time.

 Do we?

Yes, we do.

 I'm used to generating lists and iterating over them (in Perl 5)
 with things like:

  for (1 .. $max)

 where the intention is that if $max is zero, the loop doesn't execute at
 all. Having the equivalent Perl 6 list generation operator, C<...>,
 start counting backwards could be confusing.

 Especially if Perl 6 also has a range operator, C<..>, which would Do
 The Right Thing for me in this situation, and where the Perl 6 operator
 that Does The Right Thing is spelt the same as the Perl 5 operator that
 I'm used to; that muddles the distinction you make above about matching
 ranges versus generating lists.

It does muddy the difference, which is why my own gut instinct would
have been to do away with infix:<..>'s ability to generate lists.
Fortunately, I'm not in charge here, and wiser heads than mine have
decreed that infix:<..>, when used in list context, will indeed
generate a list in a manner that closely resembles Perl 5's range
operator: start with the LHS, then increment until you equal or exceed
the RHS - and if you start out exceeding the RHS, you've got yourself
an empty list.

You can do the same thing with the infix:<...> operator, too; but
doing so will be bulkier (albeit much more intuitive).  For example,
the preferred Perl 6 approach to what you described would be:

for 1, 2 ... $x

The two-element list on the left of the series operator invokes a bit
of magic that tells it that the algorithm for generating the next step
in the series is to invoke the increment operator.  This is all
described in S03 in considerable detail; I suggest rereading the
section there concerning the series operator before passing judgment
on it.
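
The difference Smylers is worried about can be modelled in Python as follows (a sketch of the two behaviours as described in this thread, not of the S03 text itself; both function names are hypothetical):

```python
def perl5_style_range(lo, hi):
    """Increment from lo up to hi; empty when lo already exceeds hi."""
    return list(range(lo, hi + 1))


def two_way_series(lo, hi):
    """Step from lo to hi in whichever direction reaches it."""
    step = 1 if hi >= lo else -1
    return list(range(lo, hi + step, step))


for max_value in (3, 0):
    print(max_value, perl5_style_range(1, max_value), two_way_series(1, max_value))
```

With a maximum of zero the range-style loop body never runs, while the two-way series counts down from 1 to 0.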

--
Jonathan Dataweaver Lang

Re: Suggested magic for a .. b

2010-07-21 Thread Mark J. Reed

Ok, I find that surprising (and counter to current Rakudo behavior),
but thanks for the correction, and sorry about the misinformation.

On Wednesday, July 21, 2010, Larry Wall la...@wall.org wrote:
 On Tue, Jul 20, 2010 at 11:53:27PM -0400, Mark J. Reed wrote:
 : In particular, consider that pi ~~ 0..4 is true,
 :  because pi is within the range; but pi ~~ 0...4 is false, because pi
 : is not one of the generated elements.

 Small point here, it's not because pi is fractional: 3 ~~ 0...4 is
 also false because 3 !eqv (0,1,2,3,4).  There is no implicit any()
 on a smartmatch list pattern as there is in Perl 5.  In Perl 6 the
 pattern 0...4 may only match a list with the same 5 elements in the
 same order.

 Larry


-- 
Mark J. Reed markjr...@gmail.com

Re: Suggested magic for a .. b

2010-07-21 Thread Mark J. Reed

Strike the "counter to current Rakudo behavior" bit; Rakudo is
behaving as specified in this instance.  I must have been
hallucinating.

On Wed, Jul 21, 2010 at 7:33 AM, Mark J. Reed markjr...@gmail.com wrote:
 Ok, I find that surprising (and counter to current Rakudo behavior),
 but thanks for the correction, and sorry about the misinformation.

 On Wednesday, July 21, 2010, Larry Wall la...@wall.org wrote:
 On Tue, Jul 20, 2010 at 11:53:27PM -0400, Mark J. Reed wrote:
 : In particular, consider that pi ~~ 0..4 is true,
 :  because pi is within the range; but pi ~~ 0...4 is false, because pi
 : is not one of the generated elements.

 Small point here, it's not because pi is fractional: 3 ~~ 0...4 is
 also false because 3 !eqv (0,1,2,3,4).  There is no implicit any()
 on a smartmatch list pattern as there is in Perl 5.  In Perl 6 the
 pattern 0..4 may only match a list with the same 5 elements in the
 same order.

 Larry


 --
 Mark J. Reed markjr...@gmail.com




-- 
Mark J. Reed markjr...@gmail.com

Re: Suggested magic for a .. b

2010-07-21 Thread Larry Wall

On Wed, Jul 21, 2010 at 09:23:11AM -0400, Mark J. Reed wrote:
: Strike the "counter to current Rakudo behavior" bit; Rakudo is
: behaving as specified in this instance.  I must have been
: hallucinating.

Well, except that we both neglected precedence.   Since ... is looser
than ~~, it must be written 3 ~~ (0...4).  :-)

Larry

Re: Suggested magic for a .. b

2010-07-21 Thread Aaron Sherman

On Wed, Jul 21, 2010 at 1:28 AM, Aaron Sherman a...@ajs.com wrote:


 For reference, this is the relevant section of the spec:

 Character positions are incremented within their natural range for any
 Unicode range that is deemed to represent the digits 0..9 or that is deemed
 to be a complete cyclical alphabet for (one case of) a (Unicode) script.
 Only scripts that represent their alphabet in codepoints that form a cycle
 independent of other alphabets may be so used. (This specification defers to
 the users of such a script for determining the proper cycle of letters.) We
 arbitrarily define the ASCII alphabet not to intersect with other scripts
 that make use of characters in that range, but alphabets that intersperse
 ASCII letters are not allowed.


 I'm not sure that all of that tracks with the Unicode standard's use of
 some of the terms, but based on what we've discussed, perhaps we could get
 more specific there:

 Character positions are incremented within their Unicode Script, but only
 in keeping with their General Category property. Thus C<A++> yields C<B>
 which is the next codepoint, but C<Ă++> yields C<Ą> even though ă
 falls between the two, when incrementing codepoints. Should this prove
 problematic for any specific Unicode Script which requires special handling
 (e.g. because a letter really isn't used as a letter at all), such special
 handling may be applied, but the above is the general rule.


Oh, so close! I realized that I broke the original spec, here. We need to
add back in:

There are two special cases: the ASCII-compatible lower-case letters (a-z)
and the ASCII-compatible upper-case letters (A-Z). For historical reasons,
these, by default, will not increment past the end of their ranges into the
higher-codepoint Latin characters.
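
One reading of this special case, sketched in Python for the lower-case letters only (an illustration, not the spec's algorithm; `magic_succ` is a hypothetical name, and carrying into a new leading "a" follows Perl 5's string increment, which is assumed here):

```python
import string

LOWER = string.ascii_lowercase


def magic_succ(word):
    """Increment a lower-case ASCII string, wrapping z to a with a carry."""
    if not word or any(ch not in LOWER for ch in word):
        raise ValueError("only non-empty a-z strings are handled in this sketch")
    chars = list(word)
    i = len(chars) - 1
    while i >= 0:
        if chars[i] != "z":
            # Stay inside a-z: move to the next letter and stop.
            chars[i] = LOWER[LOWER.index(chars[i]) + 1]
            return "".join(chars)
        chars[i] = "a"  # wrap around and carry to the position on the left
        i -= 1
    return "a" + "".join(chars)  # every position wrapped, so grow by one


print(magic_succ("a"), magic_succ("az"), magic_succ("zz"))
```

The point of the special case is visible in the wrap: "z" never advances to the next codepoint ("{") or into the higher Latin letters.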


Note: we might want a pragma for that as well. I'd suggest that perhaps it
should be a locale-specific feature? So, if you set your locale to fr, then
you include in those ranges all of the Latin characters used in French.

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs

Re: Suggested magic for a .. b

2010-07-21 Thread Darren Duncan


Larry Wall wrote:

On Tue, Jul 20, 2010 at 11:53:27PM -0400, Mark J. Reed wrote:
: In particular, consider that pi ~~ 0..4 is true,
:  because pi is within the range; but pi ~~ 0...4 is false, because pi
: is not one of the generated elements.

Small point here, it's not because pi is fractional: 3 ~~ 0...4 is
also false because 3 !eqv (0,1,2,3,4).  There is no implicit any()
on a smartmatch list pattern as there is in Perl 5.  In Perl 6 the
pattern 0..4 may only match a list with the same 5 elements in the
same order.


For some reason I thought smart match in Perl 6, when presented with some 
collection on the right-hand side, would test if the value on the left-hand side 
was contained in the collection.


So, for example:

  my @ary = (1,4,3,2,9);
  my $test = 3;
  $test ~~ @ary;  # TRUE

Similarly, since a range represents a set of all values between 2 endpoints, I 
might have thought this would be reasonable:


  3 ~~ 1..5  # TRUE

So if that doesn't work, then what is the canonical way to ask if a value is in 
a range?


Would any of these be reasonable?

  3 ~~ any(1..5)

  3 in 1..5

  3 ∈ 1..5  # Unicode alternative

-- Darren Duncan

Re: Suggested magic for a .. b

2010-07-21 Thread Mark J. Reed

On Wed, Jul 21, 2010 at 3:55 PM, Darren Duncan dar...@darrenduncan.net wrote:
 Larry Wall wrote:

 On Tue, Jul 20, 2010 at 11:53:27PM -0400, Mark J. Reed wrote:
 : In particular, consider that pi ~~ 0..4 is true,
 :  because pi is within the range; but pi ~~ 0...4 is false, because pi
 : is not one of the generated elements.

 Small point here, it's not because pi is fractional: 3 ~~ 0...4 is
 also false because 3 !eqv (0,1,2,3,4).  There is no implicit any()
 on a smartmatch list pattern as there is in Perl 5.  In Perl 6 the
 pattern 0..4 may only match a list with the same 5 elements in the
 same order.

 For some reason I thought smart match in Perl 6, when presented with some
 collection on the right-hand side, would test if the value on the left-hand
 side was contained in the collection.

That was my thought as well.

 Similarly, since a range represents a set of all values between 2 endpoints,
 I might have thought this would be reasonable:

  3 ~~ 1..5  # TRUE

AIUI, that is indeed correct.  Ranges smartmatch by testing for
inclusion in the range.  But collections don't smartmatch by testing
for inclusion in the collection.  Which was probably the subject of a
thread I missed somewhere...

For series, I think the canonical solution is to use any().
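
The three behaviours discussed here can be put side by side in Python (a model of the semantics as summarized in this thread, not Perl 6 itself; all three function names are hypothetical):

```python
import math


def matches_range(value, lo, hi):
    """Range pattern: true when the value lies between the endpoints."""
    return lo <= value <= hi


def matches_list(value, items):
    """List pattern: true only for an equal list, element by element."""
    return value == list(items)


def matches_any(value, items):
    """any() pattern: true when the value equals some generated element."""
    return any(value == item for item in items)


series = [0, 1, 2, 3, 4]  # stands in for the generated series 0...4

print(matches_range(math.pi, 0, 4))   # pi is inside the range
print(matches_list(3, series))        # 3 is not the whole list
print(matches_any(3, series))         # but 3 is one of its elements
print(matches_any(math.pi, series))   # pi is never generated
```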

-- 
Mark J. Reed markjr...@gmail.com

Re: Suggested magic for a .. b

2010-07-20 Thread Solomon Foster

On Tue, Jul 20, 2010 at 7:31 PM, Aaron Sherman a...@ajs.com wrote:
 2) We deny that a range whose LHS is larger than its RHS makes sense, but
 we also don't provide an easy way to construct such ranges lazily otherwise.
 This would be annoying only, but then we have declared that ranges are the
 right way to construct basic loops (e.g. for (1..1e10).reverse -> $i {...}
 which is not lazy (blows up your machine) and feels awfully clunky next to
 for 1e10..1 -> $i {...} which would not blow up your machine, or even make
 it break a sweat, if it worked)

Ranges haven't been intended to be the right way to construct basic
loops for some time now.  That's what the ... series operator is
for.

for 1e10 ... 1 -> $i {
 # whatever
}

is lazy by the spec, and in fact is lazy and fully functional in
Rakudo.  (Errr... okay, actually it just seg faulted after hitting
968746 in the countdown.  But that's a Rakudo bug unrelated to
this, I'm pretty sure.)

All the magic that one wants for handling loop indices -- going
backwards, skipping numbers, geometric series, and more -- is present
in the series operator.  Range is not supposed to do any of that stuff
other than the most basic forward sequence.
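
As a rough illustration of the kinds of loop sequences listed above, here is a sketch assuming present-day Rakudo semantics for the series operator; the 2010 implementation may have behaved differently.

```raku
# Going backwards: the direction is inferred from the endpoints.
my @down = 5 ... 1;

# Skipping numbers: two leading terms fix an arithmetic step.
my @evens = 0, 2 ... 10;

# Geometric series: three leading terms fix a constant ratio.
my @powers = 1, 2, 4 ... 64;

# Laziness: only the elements asked for are generated,
# even though the full countdown has ten billion steps.
my @first-three = (1e10 ... 1)[^3];

say @down;
say @evens;
say @powers;
say @first-three;
```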

-- 
Solomon Foster: colo...@gmail.com
HarmonyWare, Inc: http://www.harmonyware.com

Re: Suggested magic for a .. b

Solomon Foster wrote:
 Ranges haven't been intended to be the right way to construct basic
 loops for some time now.  That's what the ... series operator is
 for.

    for 1e10 ... 1 -> $i {
         # whatever
    }

 is lazy by the spec, and in fact is lazy and fully functional in
 Rakudo.  (Errr... okay, actually it just seg faulted after hitting
 968746 in the countdown.  But that's a Rakudo bug unrelated to
 this, I'm pretty sure.)

You took the words out of my mouth.

 All the magic that one wants for handling loop indices -- going
 backwards, skipping numbers, geometric series, and more -- is present
 in the series operator.  Range is not supposed to do any of that stuff
 other than the most basic forward sequence.

Here, though, I'm not so sure: I'd like to see how many of Aaron's
issues remain unresolved once he reframes them in terms of the series
operator.

-- 
Jonathan Dataweaver Lang

Re: Suggested magic for a .. b

2010-07-20 Thread Solomon Foster

On Tue, Jul 20, 2010 at 10:00 PM, Jon Lang datawea...@gmail.com wrote:
 Solomon Foster wrote:
 Ranges haven't been intended to be the right way to construct basic
 loops for some time now.  That's what the ... series operator is
 for.

    for 1e10 ... 1 -> $i {
         # whatever
    }

 is lazy by the spec, and in fact is lazy and fully functional in
 Rakudo.  (Errr... okay, actually it just seg faulted after hitting
 968746 in the countdown.  But that's a Rakudo bug unrelated to
 this, I'm pretty sure.)

 You took the words out of my mouth.

 All the magic that one wants for handling loop indices -- going
 backwards, skipping numbers, geometric series, and more -- is present
 in the series operator.  Range is not supposed to do any of that stuff
 other than the most basic forward sequence.

 Here, though, I'm not so sure: I'd like to see how many of Aaron's
 issues remain unresolved once he reframes them in terms of the series
 operator.

Sorry, didn't mean to imply the series operator was perfect.  (Though
it is surprisingly awesome in general, IMO.)  Just that the right
questions would be about the series operator rather than Ranges.

The questions definitely look different that way: for example,
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
clearly expressed as

'A' ... 'Z', 'a' ... 'z' # don't think this works in Rakudo yet  :(

That suggests to me that the current behavior of 'A' ... 'z' is pretty
reasonable.
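
For what it's worth, a sketch of the two-series form, assuming present-day Rakudo. Each series is slipped into the list explicitly here rather than relying on the chained form written above, which is a deliberate simplification.

```raku
# Build the 52 ASCII letters from two separate series,
# flattening each one into the surrounding list with the | prefix.
my @letters = |('A' ... 'Z'), |('a' ... 'z');

say @letters.join;
say @letters.elems;
```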

-- 
Solomon Foster: colo...@gmail.com
HarmonyWare, Inc: http://www.harmonyware.com

Re: Suggested magic for a .. b

2010-07-20 Thread Aaron Sherman

Side note: you could get around some of the problems, below, but in order to
do so, you would have to exhaustively express all of Unicode using the Str
builtin module's RANGES constant. In fact, as it is now, it defines ASCII
lowercase, but doesn't define Latin lowercase. Presumably because doing so
would be a massive pain. Again, I'll point out that using script and
properties is much easier.

On Tue, Jul 20, 2010 at 10:35 PM, Solomon Foster colo...@gmail.com wrote:

Sorry, didn't mean to imply the series operator was perfect. (Though
it is surprisingly awesome in general, IMO.) Just that the right
questions would be about the series operator rather than Ranges.

So, what's the intention of the range operator, then? Is it just there to
offer backward compatibility with Perl 5? Is it a vestige that should be
removed so that we can Huffman ... down to ..?

I'm not trying to be difficult, here, I just never knew that ... could
operate on a single item as LHS, and if it can, then .. seems to be obsolete
and holding some prime operator real estate.

The questions definitely look different that way: for example,
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
clearly expressed as

'A' ... 'Z', 'a' ... 'z' # don't think this works in Rakudo yet :(

I still contend that this is so frequently desirable that it should have a
simpler form, but it's still going to have problems.

One example: for expressing Katakana letters (I use letters in the
Unicode sense, here) it's still dicey. There are things interspersed in the
Unicode sequence for Katakana that aren't the same thing at all. Unicode
calls them lowercase, but that's not quite right. They're smaller versions
of Katakana characters which are used more as punctuation or accents than as
syllabic glyphs the way the rest of Katakana is.

I guess you could write:

ア, イ, ウ, エ, オ, カ ... ヂ,ツ ...モ,ヤ, ユ, ヨ ... ロ, ワ ... ヴ (add quotes to taste)

But that seems quite a bit more painful than:

ア .. ヴ (or ... if you prefer)

Similar problems exist for many scripts (including some of Latin, we're just
used to the parts that are odd), though I think it's possible that Katakana
may be the worst because of the mis-use of Ll to indicate a letter when the
truth of the matter is far more complicated.

That suggests to me that the current behavior of 'A' ... 'z' is pretty
reasonable.

You still have to decide to make at least some allowances for invalid
codepoints and I think you should avoid ever generating a combining or
modifying codepoint in such a sequence (e.g. Ѻ ... Ҋ in Cyrillic which
contains several combining characters for currency and counting as well as
one undefined codepoint).

--
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs

Re: Suggested magic for a .. b

Approaching this with the notion firmly in mind that infix:<..> is
supposed to be used for matching ranges while infix:<...> should be
used to generate series:

Aaron Sherman wrote:
 Walk with me a bit, and let's explore the concept of intuitive character
 ranges? This was my suggestion, which seems pretty basic to me:

 x .. y, for all strings x and y, which are composed of a single, valid
 codepoint which is neither combining nor modifying, yields the range of all
 valid, non-combining/modifying codepoints between x and y, inclusive which
 share the Unicode script, general category major property and general
 category minor property of either x or y (lack of a minor property is a
 valid value).

This is indeed true for both range-matching and series-generation as
the spec is currently written.

 In general we have four problems with current specification and
 implementation on the Perl 6 and Perl 5 sides:

 1) Perl 5 and Rakudo have a fundamental difference of opinion about what
 some ranges produce (A .. z, X .. T, etc) and yet we've never really
 articulated why we want that.

 2) We deny that a range whose LHS is larger than its RHS makes sense, but
 we also don't provide an easy way to construct such ranges lazily otherwise.
 This would be annoying only, but then we have declared that ranges are the
 right way to construct basic loops (e.g. for (1..1e10).reverse -> $i {...}
 which is not lazy (blows up your machine) and feels awfully clunky next to
 for 1e10..1 -> $i {...} which would not blow up your machine, or even make
 it break a sweat, if it worked)

With ranges, we want C<< when $LHS .. $RHS >> to always mean C<< if
$LHS <= $_ <= $RHS >>.  If $RHS < $LHS, then the range being specified
is not valid.  In this context, it makes perfect sense to me why it
doesn't generate anything.

With series, we want C<< $LHS ... $RHS >> to generate a list of items
starting with $LHS and ending with $RHS.  If $RHS > $LHS, we want it
to increment one step at a time; if $RHS < $LHS, we want it to
decrement one step at a time.
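
A small sketch of the two behaviours just described, assuming present-day Rakudo; between() is a hypothetical helper that only spells out the interval test, not anything from the spec.

```raku
# Hypothetical helper: the comparison a Range smartmatch is meant to stand for.
sub between($x, $lhs, $rhs) { $lhs <= $x <= $rhs }

# The Range match and the explicit comparison agree for values
# below, inside, and above the range.
my @agree = (0, 3, 7).map({ ($_ ~~ 1..5) == between($_, 1, 5) });

# A Range whose right endpoint is smaller contains nothing and matches nothing.
my $reversed-elems = (5..1).elems;
my $in-reversed = 3 ~~ 5..1;

# A series with the same endpoints steps downwards instead.
my @down = 5 ... 1;

say @agree;
say $reversed-elems;
say $in-reversed;
say @down;
```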

So: 1) we want different behavior from the Range operator in Perl 6
vs. Perl 5 because we have completely re-envisioned the range
operator.  What we have replaced it with is fundamentally more
flexible, though not necessarily perfect.

 3) We've never had a clear-cut goal in allowing string ranges (as opposed to
 character ranges, which Perl 5 and 6 both muddy a bit), so intuitive
 becomes sketchy at best past the first grapheme, and ever muddier when only
 considering codepoints (thus that wing of my proposal and current behavior
 are on much shakier ground, except in so far as it asserts that we might
 want to think about it more).

I think that one notion that we're dealing with here is the idea that
C<< $X < $X.succ >> for all strings.  This seems to be a rather
intuitive assumption to make; but it is apparently not an assumption
that Stringy.succ makes.  As I understand it, "Z".succ eqv "AA".  What
benefit do we gain from this behavior?  Is it the idea that eventually
this will iterate over every possible combination of capital letters?
If so, why is that a desirable goal?


My own gut instinct would be to define the string iterator such that
it increments the final letter in the string until it gets to Z;
then it resets that character to A and increments the next character
by one:

ABE, ABF, ABG ... ABZ, ACA, ACB ... ZZZ

This pattern ensures that for any two strings in the series, the first
one will be less than its successor.  It does not ensure that every
possible string between ABE and ZZZ will be represented; far from
it.  But then, 1...9 doesn't produce every number between 1 and 9; it
only produces integers.  Taken to an extreme: pi falls between 1 and
9; but no one in his right mind expects us to come up with a general
sequencing of numbers that increments from 1 to 9 with a guarantee
that it will hit pi before reaching 9.

Mind you, I know that the above is full of holes.  In particular, it
works well when you limit yourself to strings composed of capital
letters; do anything fancier than that, and it falls on its face.
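
The ordering concern can be checked directly. This is a sketch assuming present-day Rakudo's string .succ, which may not match the 2010 behaviour exactly: inside a fixed width the successor sorts after its predecessor, but the overflow from "Z" to "AA" breaks that.

```raku
# Carry within a fixed width, as in the ABZ -> ACA step of the pattern above.
my $carry = 'ABZ'.succ;

# Overflow past the last letter grows the string by one character.
my $overflow = 'Z'.succ;

# String comparison: does each value sort before its successor?
my $carry-ordered = 'ABZ' lt $carry;
my $overflow-ordered = 'Z' lt $overflow;

say "'ABZ'.succ = $carry, ordered: ", $carry-ordered;
say "'Z'.succ   = $overflow, ordered: ", $overflow-ordered;
```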

 4) Many ranges involving single characters on LHS and RHS result in null
 or infinite output, which is deeply non-intuitive to me, and I expect many
 others.

Again, the distinction between range-matching and series-generation
comes to the rescue.

 Solve those (and I tried in my suggestion) and I think you will be able to
 apply intuition to character ranges, but only in so far as a human being is
 likely to be able to intuit anything related to Unicode.

Of the points that you raise, #1, 2, and 4 are neatly solved already.
I'm unsure as to #3; so I'd recommend focusing some scrutiny on it.

 The current behaviour of the range operator is (if I recall correctly):
 1) if both sides are single characters, make a range by incrementing
 codepoints


 Sadly, you can't do that reasonably. Here are some examples of why, using
 only Latin and Greek as examples (not the most convoluted Unicode

Re: Suggested magic for a .. b

Aaron Sherman wrote:
So, what's the intention of the range operator, then? Is it just there to
offer backward compatibility with Perl 5? Is it a vestige that should be
removed so that we can Huffman ... down to ..?

I'm not trying to be difficult, here, I just never knew that ... could
operate on a single item as LHS, and if it can, then .. seems to be obsolete
and holding some prime operator real estate.

On the contrary: it is not a vestige, it is not obsolete, and it's
making good use of the prime operator real estate that it's holding.
It's just not doing what it did in Perl 5.

I strongly recommend that you reread S03 to find out exactly what each
of these operators does these days.

The questions definitely look different that way: for example,
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
clearly expressed as

'A' ... 'Z', 'a' ... 'z' # don't think this works in Rakudo yet :(

I still contend that this is so frequently desirable that it should have a
simpler form, but it's still going to have problems.

I guess you could write:

ア, イ, ウ, エ, オ, カ ... ヂ,ツ ...モ,ヤ, ユ, ヨ ... ロ, ワ ... ヴ (add quotes to taste)

But that seems quite a bit more painful than:

ア .. ヴ (or ... if you prefer)

Some of this might be addressed by filtering the list as you go -
though I don't remember the method for doing so. Something like
.grep, I think, with a regex in it that only accepts letters:

(ア ... ヴ).«grep(/:alpha:/)

...or something to that effect.

Still, it's possible that we might need something that's more flexible
than that.
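
One way such a filter might look, as a sketch assuming present-day Rakudo. It walks the codepoints between the two endpoints and drops anything whose Unicode name contains SMALL; filtering by name is an assumption made here for illustration, not the script and property test proposed earlier in the thread.

```raku
# Walk every codepoint from ア to ヴ, turn each back into a character,
# and keep only those whose Unicode name does not mark them as SMALL.
my @kana = ('ア'.ord .. 'ヴ'.ord)
    .map(*.chr)
    .grep({ .uniname !~~ /SMALL/ });

say @kana.join(' ');
```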

--
Jonathan Dataweaver Lang

Re: Suggested magic for a .. b