Re: dis-junctive patterns

2005-11-22 Thread TSa

HaloO,

Gaal Yahas wrote:

In pugs, r7961:

 my @pats = /1/, /2/;
 say "MATCH" if 1 ~~ any @pats; # MATCH
 say "MATCH" if 0 ~~ any @pats; # no match

So far so good. But:

 my $junc = any @pats;
 say "MATCH" if 1 ~~ $junc; # no match
 say "MATCH" if 0 ~~ $junc; # no match

Bug? Feature?


Ohh, interesting. This reminds me of my proposal
that junctions are code types and exert their magic
only when recognized as such. The any(@pats) form
constructs such a code object right in the match while
the $junc var hides it. My idea was to explicitly
request a code evaluation by one of

  my &junc = any @pats; # 1: use code sigil
  say "MATCH" if 1 ~~ junc;

  say "MATCH" if 1 ~~ do $junc; # 2: do operator

  say "MATCH" if 1 ~~ $junc();  # 3: call operator

But this might just be wishful thinking on my side.
--


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 12:08:08PM -0800, Larry Wall wrote:
 On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote:
 : There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the
 : A5-page.
 
 Hmm, well, thanks--I went to fix it and I see Patrick beat me to
 the fix.  But in one of the updates, it says:
 
 +[Update: Actually, that's now written C<< <+alpha+digit> >>, avoiding
 +the mistaken impression entirely.]

I went ahead and added the update while fixing the typos.  :-)

 And it occurs to me that we could probably allow <alpha+digit> there
 since there's no ambiguity what alpha means, and we're already claiming
 the next character after the opening word to decide how to process the
 rest of the text inside angles.  Even if someone writes
 
 <alpha + digit>
 
 that would fail under the current policy of treating "+ digit" as rule,
 since you can't start a rule with +.

Somehow I prefer the explicit leading + or -, so that we *know* this
is a rule composition of some sort.  It also fits in well with the
convention that the first character after the '<' lets you know
what kind of assertion is being created.
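
As a rough illustration of the "explicit leading + or -" convention, here is a minimal Python sketch that treats such a spec as set algebra over named classes. It is an analogy only, not how PGE works: the names NAMED_CLASSES and compose are made up for this note, the classes are ASCII-only, and whitespace is deliberately not accepted.

```python
import string

# Hypothetical named classes, ASCII-only for brevity (an assumption).
NAMED_CLASSES = {
    "alpha": set(string.ascii_letters),
    "digit": set(string.digits),
    "identchar": set(string.ascii_letters + string.digits + "_"),
}

def compose(spec):
    """Evaluate a spec such as '+alpha+digit' or '+identchar-digit'."""
    if not spec or spec[0] not in "+-":
        raise ValueError("spec must start with an explicit + or -")
    result = set()
    i = 0
    while i < len(spec):
        sign = spec[i]
        # Scan forward to the next sign (or the end) to find the class name.
        j = i + 1
        while j < len(spec) and spec[j] not in "+-":
            j += 1
        members = NAMED_CLASSES[spec[i + 1:j]]
        # '+' adds the class to the result, '-' removes it.
        result = result | members if sign == "+" else result - members
        i = j
    return result
```

With this reading, compose("+identchar-digit") keeps the underscore and the letters but drops the digits, and the leading sign is what marks the whole thing as a composition.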

 Unfortunately, though,
 
 <identchar - digit>
 
 would be ambiguous, and/or wrong.  Could allow whitespace there if we
 picked an explicit "this is rule" character.  Did we remove "this is
 string"?  

I didn't recall seeing anything that removed "this is string", so it's
currently implemented in PGE.  It's kind of a nice shortcut:

<bracketed: []()>

but it would be no real problem to eliminate it and go
strictly with:

<bracketed('[]()')>

"This is rule" is currently whitespace; whatever follows is taken to be
a pattern.

But let me know what you decide so I can make the appropriate
changes.  :-)

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
 : There's also <sp>, unless someone redefines the <sp> subrule.
 
 But you can't use <sp> in a character class.  Well, that is, unless
 you write it:
 
 <+[ a..z ]+<sp>>
 
 or some such.  Maybe that's good enough.

Er, that's now <+[ a..z ]+sp>, unless you're now changing it back.

 : And in the general case that's a slightly more expensive mechanism 
 : to get a space (it involves at least a subrule lookup).  Perhaps 
 : we could also create a visible meta sequence for it, in the same 
 : way that we have visible metas for \e, \f, \r, \t.  But I have 
 : no idea what letter we might use there.
 
 Something to be said for \_ in that regard.

Yes, I thought of \_ but mentally I still have trouble 
classifying _ along with the alphabetics -- '_' looks more
like punctuation to me.  And in general we use backslashes
in front of metacharacters to remove their meta meaning
(or when we aren't sure if a character has a meta meaning),
so that \_ somehow seems like it ought to be a literal
underscore, guarding against the possibility that the unescaped
underscore has a meta meaning.  (And yes, I can shoot
holes in this line of thinking along with everyone else.)

Whatever shortcuts we introduce, I'll be happy if we can just
rule that backslash+space (i.e., \ ) is a literal space
character -- i.e., keeping the principle that placing a backslash
in front of a metacharacter removes that character's meta
behavior.

 I dunno.  If «...» in ordinary code does shell quoting, maybe «...» in
 rules does filename globbing or some such.  I can see some issues with
 anchoring semantics.  Makes more sense on a string as a whole, but maybe
 can anchor on element boundaries if used on a list of filenames.
 I suppose one could even go as far as
 
 rule jpeg :i « *.jp{e,}g »
 
 or whatever the right glob syntax is.

Since we already have :perl5, I'd think that we'd want globbing 
to be something like

rule jpeg :i :glob /*.jp{e,}g/

or, for something intra-rule-ish:

m :w / mv (:glob *.c)+ dir /

And perhaps we'd want a general form for specifying other 
pattern syntaxes; i.e., :perl5 and :glob are shortcuts for
:syntax('perl5') and :syntax('glob') or something like that.
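
To make the "shortcut for a parameterized modifier" idea concrete, here is a small Python sketch of a syntax registry. Everything in it is an assumption for illustration: SYNTAXES, compile_pattern and is_jpeg are invented names, Python's re stands in for Perl 5 syntax, and the standard fnmatch translation stands in for a glob compiler. That translation has no brace expansion, so the *.jp{e,}g case is spelled as two globs.

```python
import fnmatch
import re

# Hypothetical registry: syntax name -> function that turns source text
# into a regular expression string.
SYNTAXES = {
    "perl5": lambda text: text,   # Python's re stands in for Perl 5 here
    "glob": fnmatch.translate,    # stdlib glob-to-regex translation
}

def compile_pattern(text, syntax="perl5", ignore_case=False):
    """Analog of :syntax('...'); a named modifier is just a fixed argument."""
    regex = SYNTAXES[syntax](text)
    return re.compile(regex, re.IGNORECASE if ignore_case else 0)

# Analog of ":i :glob"; two globs because fnmatch lacks {e,} alternation.
JPEG_PATTERNS = [
    compile_pattern(glob, syntax="glob", ignore_case=True)
    for glob in ("*.jpg", "*.jpeg")
]

def is_jpeg(filename):
    return any(p.match(filename) for p in JPEG_PATTERNS)
```

Seen this way, :perl5 and :glob would simply be the registry lookup with the name filled in, which is why the two spellings come out the same from an implementation point of view.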

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote:
 
 There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the
 A5-page.

Now fixed.

  Besides, you have to be able to distinguish
  s/^/foo/ from s/$/foo/.
 
 's/$/foo/' becomes 's/<after .*>/foo/'
 <g>

Uh, no, because <after> is still a zero width assertion.  :-)

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 11:19:48PM +0100, Ruud H.G. van Tol wrote:
 Patrick R. Michaud:
 
  's/$/foo/' becomes 's/<after .*>/foo/'
  <g>
  
  Uh, no, because <after> is still a zero width assertion.  :-)
 
 That's why I chose it. It is not at the end-of-string?

Because .* matches "", /<after .*>/ would be true at 
every position in the string, including the beginning,
and this is where foo would be substituted.  

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Tue, Nov 22, 2005 at 01:09:40AM +0100, Ruud H.G. van Tol wrote:
  's/$/foo/' becomes 's/<after .*>/foo/'
 
  Uh, no, because <after> is still a zero width assertion.  :-)
 
  That's why I chose it. It is not at the end-of-string?
 
  Because .* matches "", /<after .*>/ would be true at
  every position in the string, including the beginning,
  and this is where foo would be substituted.
 
 I expected greediness, also because <after .*?> could behave non-greedy.
 ...
 But why does <after .*> behave non-greedy?

I think you may be misreading what <after .*> does -- it's a lookbehind
assertion.  An assertion such as <after pattern> attempts to match
"pattern" to the sequence immediately preceding the current match position.
It does not mean "skip over pattern and then match whatever comes
afterwards."

The greediness of the .* subpattern in <after .*> doesn't affect
things at all -- <after .*> is still a zero-width assertion.
Since .* can match at every position, <after .*> will be
a successful zero-width match (i.e., a null string) at every
position in the target string, including the beginning.

So, s/<after .*>/foo/  matches the first null string it finds 
-- the one at the beginning of the string -- and replaces it 
with foo.  It's the same as if you had written s/<null>/foo/,
since <after .*> and <null> will both end up matching exactly
the same (i.e., a zero-width string at any position).
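
The same behavior can be seen with Python's re module, offered here only as an analogy and not as Perl 6 semantics. Python requires fixed-width lookbehind, so a lookbehind on .* cannot be written directly; the sketch instead uses the empty pattern as the counterpart of a match that succeeds everywhere, and a one-character lookbehind to show that lookbehind consumes nothing. The variable names are made up for this note.

```python
import re

text = "banana"

# An empty pattern succeeds at the very first position tried, so a single
# substitution inserts at the beginning of the string.
at_start = re.sub(r"", "foo", text, count=1)

# '$' succeeds only at the end of this string, so the insertion lands there.
at_end = re.sub(r"$", "foo", text, count=1)

# A lookbehind is zero-width: it checks what precedes the current position
# and matches the null string there, removing nothing from the text.
after_a = re.sub(r"(?<=a)", "foo", text, count=1)

print(at_start)  # foobanana
print(at_end)    # bananafoo
print(after_a)   # bafoonana
```

The third result shows the key point: the assertion picks the position, and the substitution replaces only the empty string found at that position.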

If this still doesn't make any sense, contact me off-list and
I'll try and explain it there.

Pm


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-22 Thread Michele Dondi

On Mon, 21 Nov 2005, Larry Wall wrote:


I would like to publicly apologize for my remarks, which were far too
harsh for the circumstances.  I can only plead that I was trying to
be far too clever, and not thinking about how it would come across.
No, to be perfectly honest, it was more culpable than that.  I had
a niggling feeling I was being naughty, and I ignored it.  Shame on me.
I will try to pay better attention to my conscience in the future.


Oh, I'm not the person you were responding to, and probably the least 
entitled to speak in the name of everyone else here, but I feel like 
doing so to say that in all earnestness I'm quite sure no one took any 
offense at your words. Despite the slight harshness, they're above all 
witty. Just as usual: and that's the style we all like!



Michele
--
Life is like a box of chocolates:
a banal gift.
- written on a wall, V.le Sabotino - Milan.


Re: \x{123a 123b 123c}

2005-11-22 Thread Larry Wall
On Mon, Nov 21, 2005 at 11:25:20AM -0600, Patrick R. Michaud wrote:
: On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
:  : There's also <sp>, unless someone redefines the <sp> subrule.
:  
:  But you can't use <sp> in a character class.  Well, that is, unless
:  you write it:
:  
:  <+[ a..z ]+<sp>>
:  
:  or some such.  Maybe that's good enough.
: 
: Er, that's now <+[ a..z ]+sp>, unless you're now changing it back.

No, just me going senile.

:  : And in the general case that's a slightly more expensive mechanism 
:  : to get a space (it involves at least a subrule lookup).  Perhaps 
:  : we could also create a visible meta sequence for it, in the same 
:  : way that we have visible metas for \e, \f, \r, \t.  But I have 
:  : no idea what letter we might use there.
:  
:  Something to be said for \_ in that regard.
: 
: Yes, I thought of \_ but mentally I still have trouble 
: classifying _ along with the alphabetics -- '_' looks more
: like punctuation to me.  And in general we use backslashes
: in front of metacharacters to remove their meta meaning
: (or when we aren't sure if a character has a meta meaning),
: so that \_ somehow seems like it ought to be a literal
: underscore, guarding against the possibility that the unescaped
: underscore has a meta meaning.  (And yes, I can shoot
: holes in this line of thinking along with everyone else.)

I think we'll leave both _ and \_ meaning the same thing, just to avoid
that confusion path--I've seen people backwhacking anything remotely
resembling punctuation just in case it's a metacharacter, and if they
are confused about _, they might backwhack it.  More to the point,
I think <sp> and <+sp> are about the right Huffman length, given that
matching a single space is usually wrong.  You usually want \s or \s*.

: Whatever shortcuts we introduce, I'll be happy if we can just
: rule that backslash+space (i.e., \ ) is a literal space
: character -- i.e., keeping the principle that placing a backslash
: in front of a metacharacter removes that character's meta
: behavior.

Yes, that will be a space.

:  I dunno.  If «...» in ordinary code does shell quoting, maybe «...» in
:  rules does filename globbing or some such.  I can see some issues with
:  anchoring semantics.  Makes more sense on a string as a whole, but maybe
:  can anchor on element boundaries if used on a list of filenames.
:  I suppose one could even go as far as
:  
:  rule jpeg :i « *.jp{e,}g »
:  
:  or whatever the right glob syntax is.
: 
: Since we already have :perl5, I'd think that we'd want globbing 
: to be something like
: 
: rule jpeg :i :glob /*.jp{e,}g/
: 
: or, for something intra-rule-ish:
: 
: m :w / mv (:glob *.c)+ dir /

Yep, that's what I decided in my other message that was thinking about
using  ...  for word boundaries and  ...  for capturing $.

: And perhaps we'd want a general form for specifying other 
: pattern syntaxes; i.e., :perl5 and :glob are shortcuts for
: :syntax('perl5') and :syntax('glob') or something like that.

Maybe.  Or maybe it's enough that there are syntactic categories for
adding rule modifiers.  Doesn't seem like you'd want to parameterize
the current language very often.

Larry


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Tue, Nov 22, 2005 at 07:52:24AM -0800, Larry Wall wrote:
 
 I think we'll leave both _ and \_ meaning the same thing, just to avoid
 that confusion path [...]

Yay!

 : Whatever shortcuts we introduce, I'll be happy if we can just
 : rule that backslash+space (i.e., \ ) is a literal space
 : character -- i.e., keeping the principle that placing a backslash
 : in front of a metacharacter removes that character's meta
 : behavior.
 
 Yes, that will be a space.

Yay!

 : Since we already have :perl5, I'd think that we'd want globbing 
 : to be something like
 : rule jpeg :i :glob /*.jp{e,}g/
 : or, for something intra-rule-ish:
 : m :w / mv (:glob *.c)+ dir /
 
 Yep, that's what I decided in my other message that was thinking about
 using  ...  for word boundaries and  ...  for capturing $.

Yay! (Our messages on this crossed in the mail; mine was moderated for
some reason but that's been corrected.)

 : And perhaps we'd want a general form for specifying other 
 : pattern syntaxes; i.e., :perl5 and :glob are shortcuts for
 : :syntax('perl5') and :syntax('glob') or something like that.
 
 Maybe.  Or maybe it's enough that there are syntactic categories for
 adding rule modifiers.  Doesn't seem like you'd want to parameterize
 the current language very often.

At least within PGE, I'm starting to come across the situation
where each application and host language wants its own slight variations
of the regular expression syntax (for compatibility reasons).
And I figured that since we (conjecturally) have C<:lang('PIR')>, 
C<:lang('Python')> and C<:lang('TCL')> to indicate the language 
to be used for the closures within a rule, it might be nice to 
have a similar parameterized modifier for the pattern syntax
itself.

I was also thinking that one of the tricky parts to custom rule
modifiers such as :perl and :glob is that they actually change
the parsing for whatever follows, so it might be nice to have
a parameterized form to hook into rather than defining a custom
modifier for each syntax variant.  But on thinking about it 
further from an implementation perspective I guess it all comes 
out the same anyway...

Pm


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-22 Thread Larry Wall
On Tue, Nov 22, 2005 at 10:12:00AM +0100, Michele Dondi wrote:
: Oh, I'm not the person you were responding to, and probably the least 
: entitled to speak in the name of everyone else here, but I feel like 
: doing so to say that in all earnestness I'm quite sure no one took any 
: offense at your words. Despite the slight harshness, they're above all 
: witty. Just as usual: and that's the style we all like!

I like witty sayings as much as the next guy, but wit can hurt when
misdirected.  If people want me to be a machine for cranking out quote
file fodder, I'll do my best.  But I also care about my friends.

Larry


Re: \x{123a 123b 123c}

2005-11-22 Thread Damian Conway

Patrick wrote:

Since we already have :perl5, I'd think that we'd want globbing 
to be something like


rule jpeg :i :glob /*.jp{e,}g/

or, for something intra-rule-ish:

m :w / mv (:glob *.c)+ dir /


Here! Here!

And perhaps we'd want a general form for specifying other 
pattern syntaxes; i.e., :perl5 and :glob are shortcuts for

:syntax('perl5') and :syntax('glob') or something like that.


Agreed.

Damian


Re: \x{123a 123b 123c}

2005-11-22 Thread Larry Wall
On Tue, Nov 22, 2005 at 08:19:04PM +1100, Damian Conway wrote:
: And perhaps we'd want a general form for specifying other 
: pattern syntaxes; i.e., :perl5 and :glob are shortcuts for
: :syntax('perl5') and :syntax('glob') or something like that.
: 
: Agreed.

But the language in the following lexical scope is a constant, so what can
:syntax($foo) possibly mean?  [Wait, this is Damian I'm talking to.]
Nevermind, don't answer that...

And there aren't that many regexish languages anyway.  So I think :syntax
is relatively useless except for documentation, and in practice people
will almost always omit it, which makes it even less useful, and pretty
nearly kicks it over into the category of multiplied entities for me.

Larry


Re: \x{123a 123b 123c}

2005-11-22 Thread Dave Whipp

Larry Wall wrote:


And there aren't that many regexish languages anyway.  So I think :syntax
is relatively useless except for documentation, and in practice people
will almost always omit it, which makes it even less useful, and pretty
nearly kicks it over into the category of multiplied entities for me.


It's surprising how many are out there. Even if we ignore the various 
dialects of standard rexen, we can find interesting examples such as 
PSL, a language for specifying temporal assertions, for hardware design: 
http://www.project-veripage.com/psl_tutorial_5.php. Whether one would 
want to fold this syntax into a C<rule> is a different question.


There are actually a number of competing languages in this space. E.g. 
http://www.pslsugar.org/papers/pslandsva.pdf.


Re: \x{123a 123b 123c}

2005-11-22 Thread Larry Wall
On Tue, Nov 22, 2005 at 09:46:59AM -0800, Dave Whipp wrote:
: Larry Wall wrote:
: 
: And there aren't that many regexish languages anyway.  So I think :syntax
: is relatively useless except for documentation, and in practice people
: will almost always omit it, which makes it even less useful, and pretty
: nearly kicks it over into the category of multiplied entities for me.
: 
: It's surprising how many are out there.

We can certainly add a :syntax() modifier as easily as a :foolang modifier,
if we decide at some point we really need one, or if PGE could make good
use of it even if Perl 6 doesn't want it.

Larry


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Tue, Nov 22, 2005 at 10:30:20AM -0800, Larry Wall wrote:
 On Tue, Nov 22, 2005 at 09:46:59AM -0800, Dave Whipp wrote:
 : Larry Wall wrote:
 : 
 : And there aren't that many regexish languages anyway.  So I think :syntax
 : is relatively useless except for documentation, and in practice people
 : will almost always omit it, which makes it even less useful, and pretty
 : nearly kicks it over into the category of multiplied entities for me.
 : 
 : It's surprising how many are out there.
 
 We can certainly add a :syntax() modifier as easily as a :foolang modifier,
 if we decide at some point we really need one, or if PGE could make good
 use of it even if Perl 6 doesn't want it.

I'm agreeing with Larry on this one -- let's wait to decide this 
until we actually feel like we need it.

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
 On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote:
 : On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
 :  We already have, from A5, \x[0a;0d], so you can supposedly say 
 :  \x[123a;123b;123c] 
 : 
 : Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
 : in S05, so I should probably add it, or whatever syntax we end up 
 : adopting.
 
 Yes.

Out of curiosity (and so I can update S05 and PGE), what syntax 
are we adopting?  Is it semicolon, comma, space, any combination of the 
three, or ...?

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Larry Wall
On Tue, Nov 22, 2005 at 12:48:39PM -0600, Patrick R. Michaud wrote:
: On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
:  On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote:
:  : On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
:  :  We already have, from A5, \x[0a;0d], so you can supposedly say 
:  :  \x[123a;123b;123c] 
:  : 
:  : Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
:  : in S05, so I should probably add it, or whatever syntax we end up 
:  : adopting.
:  
:  Yes.
: 
: Out of curiosity (and so I can update S05 and PGE), what syntax 
: are we adopting?  Is it semicolon, comma, space, any combination of the 
: three, or ...?

S02.pod currently has it as comma.
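
For anyone updating an implementation, a comma-separated list inside \x[...] amounts to the following, sketched in Python. The function name expand_hex_list is invented for this note, and tolerating spaces around the commas is an assumption, not something stated in the thread.

```python
def expand_hex_list(spec):
    """Turn '123a,123b,123c' into the corresponding three-character string."""
    # Each comma-separated field is one code point written in hexadecimal.
    return "".join(chr(int(part.strip(), 16)) for part in spec.split(","))

crlf_reversed = expand_hex_list("0a,0d")   # line feed, then carriage return
```

So \x[0a,0d] would denote a two-character sequence, one character per field.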

Larry


Re: dis-junctive patterns

2005-11-22 Thread Larry Wall
On Tue, Nov 22, 2005 at 09:31:27AM +0200, Gaal Yahas wrote:
: In pugs, r7961:
: 
:  my @pats = /1/, /2/;
:  say MATCH if 1 ~~ any @pats; # MATCH
:  say MATCH if 0 ~~ any @pats; # no match
: 
: So far so good. But:
: 
:  my $junc = any @pats;
:  say MATCH if 1 ~~ $junc; # no match
:  say MATCH if 0 ~~ $junc; # no match
: 
: Bug? Feature?

Feels like a bug to me.  The junction should autothread the ~~ even if ~~
weren't dwimmy.  And ~~ ought to be dwimmy about junctions even if they
didn't autothread.  Maybe they're just doing the hallway dance.
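
The expected behavior can be modeled in a few lines of Python. This is only a sketch of the semantics being argued for, not of the pugs internals: AnyJunction and smart_match are invented stand-ins for any(...) and ~~, and plain regex strings stand in for the patterns.

```python
import re

class AnyJunction:
    """Hypothetical stand-in for any(...): true if any member matches."""
    def __init__(self, *members):
        self.members = members

def smart_match(value, pattern):
    """Rough stand-in for ~~ that threads over junctions."""
    if isinstance(pattern, AnyJunction):
        # Autothreading: apply the match to each member and combine with "or".
        return any(smart_match(value, m) for m in pattern.members)
    return re.search(pattern, str(value)) is not None

pats = [r"1", r"2"]

# Junction built inline in the match expression.
inline_result = smart_match(1, AnyJunction(*pats))

# The same junction stored in a variable first; it should behave identically.
junc = AnyJunction(*pats)
stored_result = smart_match(1, junc)
```

In this model the junction is an ordinary value, so storing it in a variable cannot change the outcome, which is the property the report says is currently broken.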

Larry


Re: syntax for accessing multiple versions of a module

2005-11-22 Thread Nicholas Clark
On Tue, Oct 18, 2005 at 07:38:19PM -0400, Stevan Little wrote:

 I have been meaning to do some kind of p5 prototype of this, I can  
 push it up the TODO list if it would help you.

As you can probably infer from the amount of time that it has taken for me
to realise that I've failed to reply to you, I think that I already have
rather too much going on to be able to take advantage of anything in the
near future. So thanks for the offer, but please do things in the order that
is most logical to you.

Nicholas Clark


type sigils redux, and new unary ^ operator

2005-11-22 Thread Larry Wall
I'm changing my mind about type sigils.  After playing around with ^
for a while, I find it's useful only in signatures and declarations,
and I'm generally forced to omit it when using it within inner
declarations, or it would redeclare the type.  Taking that together
with the fact that it installs a local :: symbol anyway, I think we
can safely go back to the position that the :: sigil in a signature
or declaration captures a parametric type, and otherwise is a no-op.

The problem that worried me (about wanting to refer to a type that
will exist but hasn't been declared yet) does not arise often in 
practice, and can be solved with a symbolic ref in any event, or
by predeclaring a stub type.

What tipped me over the edge, however, is that I want ^$x back for a
unary operator that is short for 0..^$x, that is, the range from 0
to $x - 1.  I kept wanting such an operator in revising S09.  It also
makes it easy to write

for ^5 { say }  # 0, 1, 2, 3, 4

Now, while it's true that ^5 is an illegal type name, a unary operator
takes an expression, and that could start with an alpha: ^rand(5).
We could conceivably keep the type sigil if we forced you to say
instead ^(rand(5)) but that seems like a bad non-orthogonality.

So let's go back to ::T for a parametric type, at least until I change
my mind again.  Sorry if you feel jerked around.
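
For readers more at home in other languages, the proposed unary ^ corresponds to a half-open range starting at zero. A Python analogy, where upto is a made-up name:

```python
def upto(x):
    """Stand-in for the proposed unary ^x, i.e. 0..^x (0 up to x - 1)."""
    return range(0, x)

values = list(upto(5))
print(values)  # [0, 1, 2, 3, 4]
```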

Larry


Re: Perl 6 Summary for 2005-11-14 through 2005-11-21

2005-11-22 Thread Leopold Toetsch


On Nov 22, 2005, at 1:40, Matt Fowles wrote:


   Call Frame Access
Chip began to pontificate about how one should access call frames.
Chip suggested using a PMC, but Leo thought that would be too slow.


No, not really. It'll be slower, yes. But my argument was: whenever you 
start introspecting a call frame, by almost whatever means, this will 
keep the call frame alive[1] (see Continuation or Closure). That is: 
timely destruction doesn't work for example and the introspection 
feature is adding another level of complexity that isn't needed per se, 
because 2 other solutions are already there (or at least implemented 
mostly).


leo

[1] a call frame PMC could be stored elsewhere and reused later, 
referring to then dead contents. Autrijus mentioned that this will need 
weak references to work properly.




Re: type sigils redux, and new unary ^ operator

2005-11-22 Thread Rob Kinyon
On 11/22/05, Larry Wall wrote:
 What tipped me over the edge, however, is that I want ^$x back for a
 unary operator that is short for 0..^$x, that is, the range from 0
 to $x - 1.  I kept wanting such an operator in revising S09.  It also
 makes it easy to write

 for ^5 { say }  # 0, 1, 2, 3, 4

I read this and I'm trying to figure out why P6 needs a unary operator
for something that is an additional character written the more legible
way. To me, ^ indicates XOR, so unary ^ should really be the bit-flip
of the operand. So, ^0 would be -1 (under 2's complement) and ^1 would
be -2. I'm not sure where this would be useful, but that's what comes
to mind when discussing a unary ^.
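
The bit-flip reading is what Python's unary ~ does on its integers, which behave as two's complement for this purpose. The snippet is an analogy only, and flip is a made-up name:

```python
def flip(n):
    """Bitwise complement, the reading suggested for a unary ^."""
    return ~n

flipped_zero = flip(0)
flipped_one = flip(1)
print(flipped_zero, flipped_one)  # -1 -2
```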

Thanks,
Rob


Re: Perl 6 Summary for 2005-11-14 through 2005-11-21

2005-11-22 Thread chromatic
On Wed, 2005-11-23 at 01:39 +0100, Leopold Toetsch wrote:

 But my argument was: whenever you 
 start introspecting a call frame, by almost whatever means, this will 
 keep the call frame alive[1] (see Continuation or Closure). That is: 
 timely destruction doesn't work for example...

Destruction or finalization?  That is, if I have a filehandle I really
want to close at the end of a scope but I don't care when GC drags it
into the void, will the close happen even if there's introspection
somewhere?

-- c