Re: Scans
Larry Wall writes:

> On Mon, May 08, 2006 at 05:30:23PM +0300, Gaal Yahas wrote:
> : We have a very nifty reduce metaoperator. Scans are a counterpart of
> : reduce that are very useful -- they are the (preferably lazy) list
> : of consecutive accumulated reductions up to the final result.

I'm obviously insufficiently imaginative. Please can you give a few examples of these things being very useful?

> Maybe that's just what reduce operators do in list context.

Instinctively I'm resistant to that, cos I can think of situations where I'd invoke reduce operators (or where I already do the Perl 5 equivalent) wanting a reduction and where the code just happens to be in list context: in a C call, or in the block of a C. Having to remember to use C<~> or C<+> to avoid inadvertently getting something complicated I don't understand sounds like the kind of thing that would trip me up. But this could just be because I don't (yet) grok scans.

Smylers
Re: Scans
On 5/9/06, Smylers <[EMAIL PROTECTED]> wrote:

> But this could just be because I don't (yet) grok scans.

Here's a simple example:

    [+] 1,2,3,4,5

would return scalar 1+2+3+4+5 as a reduction, and list (0, 1, 1+2, 1+2+3, 1+2+3+4, 1+2+3+4+5) as a scan. (The 0 comes from [+](), i.e. [+] with no arguments.)

--
Markus Laire
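[For readers more at home in Python, the reduction/scan distinction above can be sketched with functools.reduce and itertools.accumulate -- a hypothetical analogue for illustration, not the Perl 6 implementation:]

```python
from functools import reduce
from itertools import accumulate
import operator

nums = [1, 2, 3, 4, 5]

# Reduction: one final value, like [+] 1,2,3,4,5 in scalar context.
total = reduce(operator.add, nums)

# Scan: every intermediate result. Passing initial=0 mirrors the form
# above, where the leading 0 is [+]() -- the reduction of no arguments.
scan = list(accumulate(nums, operator.add, initial=0))

print(total)  # 15
print(scan)   # [0, 1, 3, 6, 10, 15]
```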
Re: Scans
Markus Laire writes:

> On 5/9/06, Smylers <[EMAIL PROTECTED]> wrote:
> > But this could just be because I don't (yet) grok scans.
>
> Here's a simple example:
>
>     [+] 1,2,3,4,5
>
> would return scalar 1+2+3+4+5 as a reduction and list (0, 1, 1+2,
> 1+2+3, 1+2+3+4, 1+2+3+4+5) as a scan.

That doesn't help. I can understand the mechanics of _what_ scans do. What I'm struggling with is _why_ they are billed as being "very useful".

So I have the list generated by the scan. And? What do I do with it? I can't think of any situation in my life where I've been wanting such a list.

Smylers
Re: Scans
On Mon, May 08, 2006 at 04:02:35PM -0700, Larry Wall wrote:

> : I'm probably not thinking hard enough, so if anyone can come up with
> : an implementation please give it :) Otherwise, how about we add this
> : to the language?
>
> Maybe that's just what reduce operators do in list context.

I love this idea and have implemented it in r10246. One question though: what should a scan for chained ops do?

    list [==] 0, 0, 1, 2, 2;
    # bool::false?
    # (bool::true, bool::true, bool::false, bool::false, bool::false) ?

--
Gaal Yahas <[EMAIL PROTECTED]>
http://gaal.livejournal.com/
Re: Scans
On Tue, May 09, 2006 at 11:23:48AM +0100, Smylers wrote:

> So I have the list generated by the scan. And? What do I do with it?
> I can't think of any situation in my life where I've been wanting such
> a list.

Scans are useful when the intermediate results are interesting, as well as when you want to cut off a stream once some threshold condition is met.

    item [+] 1 .. 10;             # 10th triangular number
    list [+] 1 .. 10;             # first 10 triangular numbers
    first { $_ > 42 } [+] 1..*;   # first triangular number over 42

If you have a series whose sum yields closer and closer approximations of some value, you can use a scan to efficiently cut off once some epsilon is reached.

--
Gaal Yahas <[EMAIL PROTECTED]>
http://gaal.livejournal.com/
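[Both uses -- inspecting intermediate results and cutting off at a threshold -- translate directly to lazy iterators. A sketch in Python, with itertools.accumulate standing in for the list form of [+]:]

```python
from itertools import accumulate, count

# First triangular number over 42: scan 1, 3, 6, 10, ... lazily and
# stop at the first value past the threshold.
triangulars = accumulate(count(1))
first_over_42 = next(t for t in triangulars if t > 42)
print(first_over_42)  # 45

# Cutting off a convergent series at an epsilon: partial sums of
# 1/2**k approach 2; stop once successive sums differ by < 1e-6.
eps = 1e-6
prev, approx = None, None
for s in accumulate(1 / 2**k for k in count()):
    if prev is not None and abs(s - prev) < eps:
        approx = s
        break
    prev = s
print(approx)  # just under 2
```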
Re: Scans
Gaal Yahas wrote:

> On Mon, May 08, 2006 at 04:02:35PM -0700, Larry Wall wrote:
> : I'm probably not thinking hard enough, so if anyone can come up with
> : an implementation please give it :) Otherwise, how about we add this
> : to the language?
>
> > Maybe that's just what reduce operators do in list context.
>
> I love this idea and have implemented it in r10246. One question
> though, what should a scan for chained ops do?
>
>     list [==] 0, 0, 1, 2, 2;
>     # bool::false?
>     # (bool::true, bool::true, bool::false, bool::false, bool::false)

Keeping in mind that the scan will contain the boolean results of the comparisons, you'd be comparing 2 with "true" in the later stages of the scan. Is that what you intended, or would ~~ be more appropriate?

(And I'm with Smylers on this one: show me a useful example, please.)

=Austin
Re: Scans
On 5/9/06, Austin Hastings <[EMAIL PROTECTED]> wrote:

> Gaal Yahas wrote:
> > I love this idea and have implemented it in r10246. One question
> > though, what should a scan for chained ops do?
> >
> >     list [==] 0, 0, 1, 2, 2;
> >     # bool::false?
> >     # (bool::true, bool::true, bool::false, bool::false, bool::false)
>
> Keeping in mind that the scan will contain the boolean results of the
> comparisons, you'd be comparing 2 with "true" in the later stages of
> the scan. Is that what you intended, or would ~~ be more appropriate?

This code

    list [==] 0, 0, 1, 2, 2;

would expand to

    [==] 0,
    0 == 0,
    0 == 0 == 1,
    0 == 0 == 1 == 2,
    0 == 0 == 1 == 2 == 2

which gives

    Bool::True, Bool::True, Bool::False, Bool::False, Bool::False

So you don't compare 2 to "true" at any stage.

ps. Should the first element of a scan be the 0-argument or the 1-argument case? I.e. should list([+] 1) return (0, 1) or (1)?

--
Markus Laire
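[Markus's expansion -- each prefix evaluated as one chained comparison -- can be modelled in a few lines of Python. This is a hypothetical sketch of the semantics, not the r10246 implementation:]

```python
def chained_eq_scan(xs):
    # One result per prefix: xs[0] == xs[1] == ... == xs[k], evaluated
    # as a single chained comparison. A one-element prefix is the
    # one-argument case [==] x, taken here to be vacuously true.
    return [all(xs[i] == xs[i + 1] for i in range(k))
            for k in range(len(xs))]

print(chained_eq_scan([0, 0, 1, 2, 2]))
# [True, True, False, False, False]
```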
Re: "normalized" hash-keys
HaloO,

Smylers wrote:

> But why would a hash be doing equality operations at all?

I think it does so in the abstract. A concrete implementation might use the .id method to get a hash value directly.

> Assuming that a hash is implemented efficiently, as a hash, then it
> needs to be able to map directly from a given key to its corresponding
> value, not to have to compare the given key in turn against each of
> the stored keys to see if they happen to match under some special
> meaning of eq.

You snipped Ruud's next bit:

> > Or broader: that the keys should be normalized (think NFKC()) before
> > usage?

As $Larry pointed out, this all boils down to getting the key type of the hash wired into the hash somehow. Assuming it dispatches on the .id method, the following might do:

    class CaseInsensitive does Str {
        method id { self.uc.Str::id }
    }

    my %hash{CaseInsensitive};

As usual, the compiler might optimize dynamic method lookup away if the key type is known at compile time.

> That seems the obvious way to implement this, that all keys are
> normalized (say with C, for this specific example) both on storage and
> look-up. Then the main hashy bit doesn't have to change at all.

Yep. But the @Larry need to confirm that the Hash calls .id on strings as well as on objects. Note that Array might rely on something similar to access the numeric key. Then you could overload it, e.g. to get a modulo index:

    class ModuloIndex does Int {
        method id { self % 10 }
    }

    my @array[ModuloIndex];

--
Re: Scans
Austin Hastings wrote:

> Gaal Yahas wrote:
> > I love this idea and have implemented it in r10246. One question
> > though, what should a scan for chained ops do?
> >
> >     list [==] 0, 0, 1, 2, 2;
> >     # bool::false?
> >     # (bool::true, bool::true, bool::false, bool::false, bool::false)
>
> Keeping in mind that the scan will contain the boolean results of the
> comparisons, you'd be comparing 2 with "true" in the later stages of
> the scan. Is that what you intended, or would ~~ be more appropriate?
>
> (And I'm with Smylers on this one: show me a useful example, please.)

Well, the above example does tell you where the leading prefix of equal values stops, assuming the second answer.

Combined with reduce it gives some interesting results:

    [+] list [?&] @bits    ==> index of first zero in bit vector

There are other APLish operators that could be very useful in combination with reduce and scan, e.g. the bit-vector form of grep (maybe called filter):

    filter (1 0 0 1 0 1 1) (1 2 3 4 5 6 7 8)    ==> (1 4 6 7)

This is really useful if you're selecting out of multiple parallel arrays. Use hyper compare ops to select what you want, followed by filter to prune out the unwanted. filter with scan gives you:

    filter (list [<] @array) @array     ==> first monotonically increasing run in @array
    filter (list [<=] @array) @array    ==> first monotonically non-decreasing run in @array

That was 5 minutes of thinking.

Mark Biggar

--
[EMAIL PROTECTED]
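[Mark's `filter` has a direct Python counterpart, itertools.compress, and his scan-based run extraction can be sketched the same way. Note this sketch models the scan as an and-scan over *adjacent* comparisons, which is one possible reading of the chained-< scan:]

```python
from itertools import accumulate, compress

# Bit-vector filter: keep elements where the mask bit is 1.
picked = list(compress([1, 2, 3, 4, 5, 6, 7, 8], [1, 0, 0, 1, 0, 1, 1]))
print(picked)  # [1, 4, 6, 7]

def increasing_run(xs):
    # Scan the adjacent comparisons with `and`; the mask stays True
    # only for the leading strictly-increasing run.
    mask = accumulate((xs[i] < xs[i + 1] for i in range(len(xs) - 1)),
                      lambda a, b: a and b, initial=True)
    return list(compress(xs, mask))

print(increasing_run([3, 5, 8, 2, 9]))  # [3, 5, 8]
```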
Re: Scans
Markus Laire wrote:

> ps. Should the first element of a scan be the 0-argument or the
> 1-argument case? I.e. should list([+] 1) return (0, 1) or (1)?

APL defines it as the latter: (1).

--
[EMAIL PROTECTED]
Re: "normalized" hash-keys
Larry Wall schreef:

> Dr.Ruud:
> > What would be the way to define-or-set that a specific hash has
> > non-case-sensitive keys?
>
> Use a shaped hash with a key type that defines infix:<===>
> appropriately, since object hashes are based on infix:<===> rather
> than infix:.

Suppose I want the keys to be Normalization-Form-C, and the values to be regexes. Would this be the way to say that?

    my Regex %hash{ NFC };

http://www.unicode.org/notes/tn5/
http://icu.sourceforge.net/
http://www.ibm.com/software/globalization/icu/

> > Or broader: that the keys should be normalized (think NFKC()) before
> > usage?
>
> I think it would be up to the type to generate, transform to and/or
> cache such a canonicalized key.

OK, alternatively "delegate it to the type".

> > Would it be easy to "delegate it to the hash"? (or use a hardly
> > noticeable wrapper)
>
> Probably--just give the hash a shape with a key type that is easily
> coerced from the input types, I suspect. Hash keys could probably
> afford to do an implicit .as(KeyType) even if the current language
> were to disallow implicit conversions in general.

Maybe not hash keys in general, but only hash keys of a type that needs it. But wait, even stringification is coercion.

--
Affijn, Ruud

"Gewoon is een tijger."
Re: Scans
On Tue, May 09, 2006 at 06:07:26PM +0300, Markus Laire wrote:

> ps. Should the first element of a scan be the 0-argument or the
> 1-argument case? I.e. should list([+] 1) return (0, 1) or (1)?

I noticed this in earlier posts and thought it odd that anyone would want to get an extra zero arg that they didn't specify. My vote would be that

    list([+] 1) == (1)

just like

    [+] 1 == 1

-Scott

--
Jonathan Scott Duff
[EMAIL PROTECTED]
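[The two candidate answers correspond to Python's accumulate with and without an explicit identity element -- illustration only, not a claim about what Perl 6 should do:]

```python
from itertools import accumulate

# list([+] 1) as the 1-argument case (the APL answer, and Scott's vote):
one_arg = list(accumulate([1]))
print(one_arg)   # [1]

# ...versus the 0-argument case, which prepends the empty reduction:
zero_arg = list(accumulate([1], initial=0))
print(zero_arg)  # [0, 1]
```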
[svn:perl6-synopsis] r9153 - doc/trunk/design/syn
Author: larry
Date: Tue May  9 14:06:29 2006
New Revision: 9153

Modified: doc/trunk/design/syn/S03.pod
Log: Reduce in list context.

    --- doc/trunk/design/syn/S03.pod  (original)
    +++ doc/trunk/design/syn/S03.pod  Tue May  9 14:06:29 2006
    @@ -12,9 +12,9 @@
       Maintainer: Larry Wall <[EMAIL PROTECTED]>
       Date: 8 Mar 2004
    -  Last Modified: 1 May 2006
    +  Last Modified: 9 May 2006
       Number: 3
    -  Version: 26
    +  Version: 27

     =head1 Changes to existing operators
    @@ -497,6 +497,15 @@
         @args = (\%a,'foo','bar');
         $x = [dehash] @args;

    +In scalar context, a reduce operator returns only the final result, but
    +in list context, the reduce operator also returns all intermediate results
    +lazily:
    +
    +    say [+] 1..*;   # (1, 3, 6, 10, 15, ...)
    +
    +Unlike other reduction operators, the C<[;]> operator is not sensitive
    +to list context.
    +
     =head1 Junctive operators

     C<|>, C<&>, and C<^> are no longer bitwise operators (see L
Re: Scans
Gaal Yahas writes:

> On Tue, May 09, 2006 at 11:23:48AM +0100, Smylers wrote:
> > So I have the list generated by the scan. And? What do I do with
> > it? I can't think of any situation in my life where I've been
> > wanting such a list.
>
> Scans are useful when the intermediate results are interesting, as
> well as when you want to cut off a stream once some threshold
> condition is met.

OK, we're getting closer, but that still sounds quite abstract to me.

> item [+] 1 .. 10;             # 10th triangular number
> list [+] 1 .. 10;             # first 10 triangular numbers
> first { $_ > 42 } [+] 1..*;   # first triangular number over 42

Same question, but one level further on: why would I want the first 10 triangular numbers, or the first triangular number over 42? Sorry to keep going on like this, but I'm still struggling to see what this gets us. Is wanting to do something like that sufficiently common in real-life situations?

I've seen other people ask similar questions about features such as junctions and hyperoperators (and folk were able to come up with suitable examples), but in those cases there was also the response that these are features which beginners can choose to ignore. I'd have no particular objection to scans being in Perl 6 if those of us without sufficient imagination were able to just ignore them, and act like they don't exist. But if things that look like reductions sometimes turn out to be scans then I have to know about them (even if just to avoid them) anyway.

And I have no problem in thinking of lots of situations where I'd find reductions handy. It slightly unnerves me that I suspect some of those would happen to be in list context -- not because I wanted a list, but because things like C blocks and C arguments are always lists. Are scans sufficiently useful that they are worth making reductions more awkward, with a higher barrier to entry (since you now have to learn about both reductions and scans at the same time in order to be able to use only one of them)?

> If you have a series whose sum yields closer and closer approximations
> of some value, you can use a scan to efficiently cut off once some
> epsilon is reached.

OK, I can see why mathematicians and engineers would want to do that. But that's a specialist field; couldn't this functionality be provided in a module? I'm unconvinced that core Perl needs features for closely approximating mathematicians any more than it needs, say, CGI parameter parsing or DBI -- they're all niches that some people use lots and others won't touch at all.

Smylers
Re: Scans
Mark A. Biggar writes:

> Austin Hastings wrote:
> > Gaal Yahas wrote:
> > > list [==] 0, 0, 1, 2, 2;
> > > # bool::false?
> > > # (bool::true, bool::true, bool::false, bool::false, bool::false)
> >
> > (And I'm with Smylers on this one: show me a useful example, please.)
>
> Well the above example does tell you where the leading prefix of equal
> values stops, assuming the second answer.

But you still have to iterate through the list of C to get that index -- so you may as well have just iterated through the input list and examined the values till you found one that differed.

> Combined with reduce it gives some interesting results:
>
>     [+] list [?&] @bits    ==> index of first zero in bit vector

Yer what? Are you seriously suggesting that as a sane way of finding the first element of C<@bits> that contains a zero? That doesn't even short-cut (since the addition reduction can't know that once it starts adding on zeros all the remaining values are also going to be zeros).

> There are other APLish operators that could be very useful in
> combination with reduce and scan:

The fact that there are more operators that go with these only adds to my suspicion that this field of stuff is appropriate for a module, not the core language.

> the bit vector form of grep (maybe called filter);
>
>     filter (1 0 0 1 0 1 1) (1 2 3 4 5 6 7 8)    ==> (1 4 6 7)

Please don't! The name 'filter' is far too useful to impose a meaning as specific as this on it.

Smylers
A rule by any other name...
On Apr 20, 2006, at 1:32 PM, Damian Conway wrote:

> Keyword   Implicit adverbs    Behaviour
> regex     (none)              Ignores whitespace, backtracks
> token     :ratchet            Ignores whitespace, no backtracking
> rule      :ratchet :words     Skips whitespace, no backtracking

[...and following threads...]

I'm comfortable with the semantic distinction between 'rule' as "thingy inside a grammar" and 'regex' as "thingy outside a grammar". But I think we can find a better name than 'regex'. The problem is both the 'regex' vs. 'regexp' battle, and the fact that everyone knows 'regex(p)' means "regular expression" no matter how many times we say it doesn't. (I'm not fond of the idea of spending the next 20 years explaining that over and over again.) Maybe 'match' is a better keyword.

Then again, from a practical perspective, it seems likely that we'll want something like ":ratchet is set by default in all rules" turned on in some grammars and off in other grammars. In which case, the real distinction is that rules inside a grammar pull default attributes from their grammar class, while rules outside a grammar have no default attributes. Which brings us back to a single keyword 'rule' making sense for both.

I'm not comfortable with the semantic distinction between 'rule' and 'token'. Whitespace skipping is not the defining difference between a rule and a token in general use of the terms, so the names are misleading. More importantly, whitespace skipping isn't a very significant option in grammars in general, so creating two keywords that distinguish between skipping and no skipping is linguistically infelicitous. It's like creating two different words for "shirts with horizontal stripes" and "shirts with vertical stripes". Sure, they're different, but the difference isn't particularly significant, so it's better expressed by a modifier on "shirt" than by a different word.

From a practical perspective, both the Perl 6 and Punie grammars have ended up using 'token' in many places (for things that aren't tokens), because :words isn't really the semantics you want for parsing computer languages. (Though it is quite useful for parsing natural language and other things.) What you want is comment skipping, which isn't the same as :words.

I suggest making whitespace skipping a default setting on the grammar class, so the grammars that need whitespace skipping most of the time can turn it on by default for their rules. That means 'token' and 'rule' collapse into just 'rule'. I also suggest a new modifier for comment skipping (or skipping in general) that's separate from :words, with semantics much closer to Parse::RecDescent's 'skip'.

Allison
Re: A rule by any other name...
On Tue, May 09, 2006 at 04:51:17PM -0700, Allison Randal wrote:

> I'm comfortable with the semantic distinction between 'rule' as
> "thingy inside a grammar" and 'regex' as "thingy outside a grammar".
> But, I think we can find a better name than 'regex'.
[...]
> Maybe 'match' is a better keyword.

Can I suggest we keep "match" meaning the thing you get when you run a thingy against a string, and make "matcher" be the thingy that gets run?

100% agree with you, Allison; thanks for putting words to "doesn't feel right".

-=- James Mastros
Re: Scans
Mark A. Biggar wrote:

> Austin Hastings wrote:
> > Gaal Yahas wrote:
> > > list [==] 0, 0, 1, 2, 2;
> > > # bool::false?
> > > # (bool::true, bool::true, bool::false, bool::false, bool::false)
>
> Well the above example does tell you where the leading prefix of equal
> values stops, assuming the second answer.

That's a long way to go...

> Combined with reduce it gives some interesting results:
>
>     [+] list [?&] @bits    ==> index of first zero in bit vector

Likely to win the obfuscated Perl contest, but ...?

> There are other APLish operators that could be very useful in
> combination with reduce and scan: the bit vector form of grep (maybe
> called filter);
>
>     filter (1 0 0 1 0 1 1) (1 2 3 4 5 6 7 8)    ==> (1 4 6 7)
>
> This is really useful if you're selecting out of multiple parallel
> arrays.

Okay, this begins to approach the land of useful. If there's a faster/better/stronger way to do array or hash slices, I'm interested. But the approach above doesn't seem to be it.

> Use hyper compare ops to select what you want followed by using filter
> to prune out the unwanted. filter gives you with scan:
>
>     filter (list [<] @array) @array    ==> first monotonically increasing run in @array

This seems false. @array = (1 2 2 1 2 3), if I understand you correctly, yields (1 2 2 3).

>     filter (list [<=] @array) @array   ==> first monotonically non-decreasing run in @array

So @array = (1 0 -1 -2 -1 -3) ==> (1, -1) is monotonically non-decreasing?

> That was 5 minutes of thinking.

I'm thinking that APL is dead for a reason, and that every language designer in the world has had a chance to pick over its desiccated bones: all the good stuff has been stolen already. So while "scans" may fall out as a potential side-effect of reduce, the real question should be: are 'scans' useful enough to justify introducing context sensitivity to the reduce operation?

=Austin
Re: A rule by any other name...
Allison wrote:

> I'm comfortable with the semantic distinction between 'rule' as
> "thingy inside a grammar" and 'regex' as "thingy outside a grammar".
> But I think we can find a better name than 'regex'. The problem is
> both the 'regex' vs. 'regexp' battle,

Is that really an issue? I've never met anyone who *voluntarily* added the 'p'. ;-)

> and the fact that everyone knows 'regex(p)' means "regular expression"
> no matter how many times we say it doesn't.

Sure. But almost nobody knows what "regular" actually means, and of those few only a tiny number of pedants actually *care* anymore. So does it matter?

> (I'm not fond of the idea of spending the next 20 years explaining
> that over and over again.)

Then don't. I teach regexes all the time and I *never* explain what "regular" means, or why it doesn't apply to Perl (or any other commonly used) regexes any more.

> Maybe 'match' is a better keyword.

I don't think so. "Match" is a better word for what comes back from a regex match (what we currently refer to as a Capture, which is okay too).

> Then again, from a practical perspective, it seems likely that we'll
> want something like ":ratchet is set by default in all rules" turned
> on in some grammars and off in other grammars. In which case, the real
> distinction is that rules inside a grammar pull default attributes
> from their grammar class, while rules outside a grammar have no
> default attributes. Which brings us back to a single keyword 'rule'
> making sense for both.

That's pretty much the Perl 5 argument for using "sub" for both subroutines and methods, which we've definitively rejected in Perl 6. If we use "rule" for both kinds of regexes, we force the reader to constantly check surrounding context in order to understand the behaviour of the construct. :-(

> I'm not comfortable with the semantic distinction between 'rule' and
> 'token'. Whitespace skipping is not the defining difference between a
> rule and a token in general use of the terms, so the names are
> misleading.

True. "Token" is the wrong word for another reason: a token is a segmented component of the input stream, *not* a rule for matching segmented components of the input stream. The correct term for that is "terminal". So a suitable keyword might well be "term". However, terminals do differ from rules in that they do not attempt to be smart about what they ignore.

> More importantly, whitespace skipping isn't a very significant option
> in grammars in general, so creating two keywords that distinguish
> between skipping and no skipping is linguistically infelicitous. It's
> like creating two different words for "shirts with horizontal stripes"
> and "shirts with vertical stripes". Sure, they're different, but the
> difference isn't particularly significant, so it's better expressed by
> a modifier on "shirt" than by a different word.

I'd *strongly* disagree with that. Whitespace skipping (for suitable values of "whitespace") is a critical feature of parsers. I'd go so far as to say that it's *the* killer feature of Parse::RecDescent.

> From a practical perspective, both the Perl 6 and Punie grammars have
> ended up using 'token' in many places (for things that aren't tokens),
> because :words isn't really the semantics you want for parsing
> computer languages. (Though it is quite useful for parsing natural
> language and other things.) What you want is comment skipping, which
> isn't the same as :words.

What you want is *whitespace* skipping (where comments are a special form of whitespace). What you *really* want is whitespace skipping where you get to define what constitutes whitespace in each context where whitespace might be skipped. But the defining characteristic of a "terminal" is that you try to match it exactly, without being smart about what to ignore. That's why I like the fundamental rule/token distinction as it is currently specified.

> I also suggest a new modifier for comment skipping (or skipping in
> general) that's separate from :words, with semantics much closer to
> Parse::RecDescent's 'skip'.

Note, however, that the recursive nature of Parse::RecDescent's directive is a profound nuisance in practice, because you have to remember to turn it off in every one of the terminals.

In light of all that, perhaps :words could become :skip, which defaults to :skip(//) but allows you to specify :skip(/whatever/). As for the keywords and behaviour, I think the right set is:

                          Default         Default
    Keyword   Where       Backtracking    Skipping
    regex     anywhere    :!ratchet       :!skip
    rule      grammars    :ratchet        :skip
    term      grammars    :ratchet        :!skip

I do agree that a rule should inherit properties from its grammar, so you can write:

    grammar Perl6 is skip(/[+ | \# | \# \N]+/) {
        ...
    }

to allow your grammar to redefine in one place what its rules skip.

Damian
Re: Scans
Austin Hastings wrote:

> I'm thinking that APL is dead for a reason. And that every language
> designer in the world has had a chance to pick over its desiccated
> bones: all the good stuff has been stolen already. So while "scans"
> may fall out as a potential side-effect of reduce, the real question
> should be "are 'scans' useful enough to justify introducing context
> sensitivity to the reduce operation?"

Amen!

Damian
Re: A rule by any other name...
Allison Randal wrote:

> More importantly, whitespace skipping isn't a very significant option
> in grammars in general, so creating two keywords that distinguish
> between skipping and no skipping is linguistically infelicitous. It's
> like creating two different words for "shirts with horizontal stripes"
> and "shirts with vertical stripes". Sure, they're different, but the
> difference isn't particularly significant, so it's better expressed by
> a modifier on "shirt" than by a different word.

This is not only "space" skipping; as we discussed, it skips over comments as well as spaces, because a language (such as Perl 6) can define its own rule that serves as a valid separator. To wit:

    void main () {}
    void/* this also works */main () {}

Or, in Perl 6:

    say time;
    say#( this also works )time;

> From a practical perspective, both the Perl 6 and Punie grammars have
> ended up using 'token' in many places (for things that aren't tokens),
> because :words isn't really the semantics you want for parsing
> computer languages. (Though it is quite useful for parsing natural
> language and other things.) What you want is comment skipping, which
> isn't the same as :words.

Currently it's defined, and used, the same as :words. I think the confusion arises from it being read as "whitespace" instead of as "word separator". Maybe making that explicit can fix it, or maybe rename it to something else, but the token/rule distinction of :words is very useful, because it's more usual for languages to behave like C and Perl 6, instead of:

    ex/* this calls exit */it();

which is rarer, and can be treated with separate "token" rules.

Audrey
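[Audrey's point -- comments acting as token separators -- can be sketched with an ordinary regex lexer whose skip pattern covers both whitespace and C-style comments. The names and patterns here are hypothetical illustration, not Perl 6's actual whitespace rule:]

```python
import re

SKIP = re.compile(r'(?:\s+|/\*.*?\*/)+', re.S)   # whitespace or /* comments */
TOKEN = re.compile(r'\w+|[^\w\s]')               # words, or single punctuation

def tokens(src):
    pos, out = 0, []
    while pos < len(src):
        m = SKIP.match(src, pos)
        if m:                      # comments are skipped exactly like spaces
            pos = m.end()
            continue
        m = TOKEN.match(src, pos)
        out.append(m.group())
        pos = m.end()
    return out

# A comment between tokens separates them just as a space would:
print(tokens("void/* this also works */main () {}"))
# ['void', 'main', '(', ')', '{', '}']
```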
Re: Scans
Smylers wrote:

> Mark A. Biggar writes:
> > Well the above example does tell you where the leading prefix of
> > equal values stops, assuming the second answer.
>
> But you still have to iterate through the list of C to get that index
> -- so you may as well have just iterated through the input list and
> examined the values till you found one that differed.

I think the one thing that is redeeming scans in this case is the (my?) assumption that they are automatically lazy. The downside is that they aren't random-access, at least not in 6.0. I expect that

    @scan ::= list [==] @array;
    say @scan[12];

will have to perform all the compares, since it probably won't be smart enough to know that == doesn't accumulate.

So yes, you iterate over the scan until you find whatever you're looking for. Then you stop searching. If you can't stop (because you're using some other listop) that could hurt.

At its most useful, it's a clever syntax for doing a map() that can compare predecessor with present value. I think that's a far better angle than any APL clonage. But because it's a side-effect of reduce, it would have to be coded using "$b but true" to support the next operation:

    sub find_insert($a, $b, $new) {
        my $insert_here = (defined($b) ?? ($a <= $new < $b) !! $new < $a);
        return $b but $insert_here;
    }

Then:

    sub list_insert($x) {
        &ins := &find_insert.assuming($new => $x);
        @.list.splice(first([&ins] @array).k, 0, $x);
    }

It's a safe bet I've blown the syntax. :(

I think I'm more enthusiastic for a pairwise traversal (map2, anyone?) than for scan. But I *know* map2 belongs in a module. :)

> > the bit vector form of grep (maybe called filter);
> >
> >     filter (1 0 0 1 0 1 1) (1 2 3 4 5 6 7 8)    ==> (1 4 6 7)
>
> Please don't! The name 'filter' is far too useful to impose a meaning
> as specific as this on it.

Hear, hear! Ixnay on the ilterfay.

=Austin
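[Austin's "iterate until you can stop" point is just generator laziness. A sketch of a lazy chained-== scan whose comparisons stop as soon as the consumer does (hypothetical model, not the Perl 6 implementation):]

```python
from itertools import takewhile

def eq_scan(xs):
    # Lazily yield the chained-== result for each prefix of xs.
    # A consumer that stops consuming also stops the comparisons.
    it = iter(xs)
    prev = next(it)
    ok = True
    yield ok                      # one-element prefix: vacuously equal
    for x in it:
        ok = ok and (prev == x)
        prev = x
        yield ok

# Length of the leading run of equal values, consuming the scan only
# up to its first False:
print(sum(1 for _ in takewhile(bool, eq_scan([0, 0, 1, 2, 2]))))  # 2
```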
[PATCH] S02 - add grammar / rule info to sigil list
i noticed a few things missing from the list of sigils. patch inline below.

~jerry

    Index: design/syn/S02.pod
    ===================================================================
    --- design/syn/S02.pod  (revision 9154)
    +++ design/syn/S02.pod  (working copy)
    @@ -494,8 +494,8 @@
         $   scalar
         @   ordered array
         %   unordered hash (associative array)
    -    &   code
    -    ::  package/module/class/role/subset/enum/type
    +    &   code/rule/token/regex
    +    ::  package/module/class/role/subset/enum/type/grammar
         @@  multislice view of @

     Within a declaration, the C<&> sigil also declares the visibility of the
[svn:perl6-synopsis] r9156 - doc/trunk/design/syn
Author: larry
Date: Tue May  9 21:26:12 2006
New Revision: 9156

Modified: doc/trunk/design/syn/S02.pod
Log: patch from jerry++.

    --- doc/trunk/design/syn/S02.pod  (original)
    +++ doc/trunk/design/syn/S02.pod  Tue May  9 21:26:12 2006
    @@ -494,8 +494,8 @@
         $   scalar
         @   ordered array
         %   unordered hash (associative array)
    -    &   code
    -    ::  package/module/class/role/subset/enum/type
    +    &   code/rule/token/regex
    +    ::  package/module/class/role/subset/enum/type/grammar
         @@  multislice view of @

     Within a declaration, the C<&> sigil also declares the visibility of the
the 'postfix:::' operator
That's postfix ::, as mentioned in the Names section of S02:

    There is no longer any special package hash such as %Foo::. Just
    subscript the package object itself as a hash object, the key of
    which is the variable name, including any sigil. The package object
    can be derived from a type name by use of the :: postfix operator:

        MyType::<$foo>

I don't see it anywhere in S03. It probably should be there, if it indeed exists, as a postfix method.

~jerry
S02: generalized quotes and adverbs
According to S02, under 'Literals', generalized quotes may now take adverbs. In that section is the following comment:

    [Conjectural: Ordinarily the colon is required on adverbs, but the
    "quote" declarator allows you to combine any of the existing
    adverbial forms above without an intervening colon:

        quote qw;   # declare a P5-esque qw//

There's trouble if both q (:single) and qq (:double) are allowed together: how would qqq resolve? I say it makes sense that we get longest-token matching first, which means it translates to :double followed by :single.

~jerry
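[jerry's longest-token proposal is easy to model: at each position, try the longer adverb alias before the shorter one. A hypothetical helper for illustration, not anything from the Perl 6 sources:]

```python
ALIASES = ['qq', 'q']   # longest first, so matching is greedy

def split_quote_adverbs(word):
    # Greedy longest-token split: 'qqq' -> ['qq', 'q'], i.e. :double
    # followed by :single, never ['q', 'qq'].
    out = []
    while word:
        for a in ALIASES:
            if word.startswith(a):
                out.append(a)
                word = word[len(a):]
                break
        else:
            raise ValueError(f"no alias matches {word!r}")
    return out

print(split_quote_adverbs('qqq'))   # ['qq', 'q']
print(split_quote_adverbs('qqqq'))  # ['qq', 'qq']
```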
Re: S02: generalized quotes and adverbs
On Tue, May 09, 2006 at 11:15:24PM -0700, jerry gay wrote:
: according to S02, under 'Literals', generalized quotes may now take
: adverbs. in that section is the following comment:
:
: [Conjectural: Ordinarily the colon is required on adverbs, but the
: "quote" declarator allows you to combine any of the existing adverbial
: forms above without an intervening colon:
:
:     quote qw;   # declare a P5-esque qw//
:
: there's trouble if both q (:single) and qq (:double) are allowed
: together. how would qqq resolve? i say it makes sense that we get
: longest-token matching first, which means it translates to :double
: followed by :single.

That would be one way to handle it. I'm not entirely convinced that we have the right adverb set yet, though. I'm still thinking about turning :n, :q, and :qq into :0, :1, and :2. I'd like to turn :ww into something single character as well. The doubled ones bother me just a little.

But as it stands, the conjectured quote declarator is kind of lame. It'd be just about as easy to allow

    quote qX :x :y :z;

so you could alias it any way you like. Or possibly just allow

    alias qX "q:x:y:z";

or even

    qX ::= "q:x:y:z";

as a simple, argumentless "word" macro. But the relationship of that to "real" macros would have to be evaluated. There's something to be said for keeping macros a little bit klunky. On the other hand, if people are going to invent simplified macro syntax anyway, I'd rather there be some standards.

Larry