Nested captures

2005-05-09 Thread Autrijus Tang
As Pugs now has Rule support via PGE (either with external parrot or a
faster, linked libparrot), I've been playing with the new capturing 
semantics.

Currently, matching 123 against /(.(.(.)))/ produces this:

$0: 123
$1: 123
$1[0]: 23
$1[0][0]: 3

Instead of the Perl 5 behaviour:

$0: 123
$1: 123
$2: 23
$3: 3

Is this correct and intended?  I tried consulting A/S/E05, but can't
find exact wording that defines this.

Thanks,
/Autrijus/
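
For readers comparing the two layouts, here is a minimal Python sketch (not from the thread). Python's `re` numbers groups by counting left parentheses, as Perl 5 does, so it reproduces the flat result; `group_parents` and `nested_captures` are hypothetical helpers that rebuild the nested layout from the lexical nesting of the parentheses.

```python
import re

def group_parents(pattern):
    """Map each capturing group number to its enclosing group (0 = whole match).

    Simplified scanner: it skips backslash escapes and treats "(?" as
    non-capturing, but does not handle character classes or named groups.
    """
    parents, stack, count, i = {}, [], 0, 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2                      # skip the escaped character
            continue
        if ch == "(":
            if pattern.startswith("(?", i):
                stack.append(None)      # non-capturing: no group number
            else:
                count += 1
                enclosing = [g for g in stack if g is not None]
                parents[count] = enclosing[-1] if enclosing else 0
                stack.append(count)
        elif ch == ")":
            stack.pop()
        i += 1
    return parents

def nested_captures(pattern, text):
    """Return the match as a tree of {"text", "subs"} dicts, or None."""
    m = re.match(pattern, text)
    if m is None:
        return None
    nodes = {0: {"text": m.group(0), "subs": []}}
    for g, parent in sorted(group_parents(pattern).items()):
        nodes[g] = {"text": m.group(g), "subs": []}
        nodes[parent]["subs"].append(nodes[g])
    return nodes[0]

m = re.match(r"(.(.(.)))", "123")
flat = m.groups()                       # Perl 5 style: count left parens
tree = nested_captures(r"(.(.(.)))", "123")
```

Here `flat` holds the three groups in Perl 5 order, while `tree["subs"][0]["subs"][0]` plays the role of `$1[0]` in the output quoted above.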




Re: Nested captures

2005-05-09 Thread Carl Franks
Are you subscribed to perl6-compiler?
Yesterday Patrick Michaud posted "PGE features update (corrections)",
which describes the results you've got:

* Match objects for nested captures are nested into the surrounding
capture object.  Thus, given

  rulesub = p6rule(":w (let) ( (\w+) \:= (\S+) )")
  match = rulesub("let foo := 123")

the outer match object contains two match objects ($/[0] and $/[1]),
and the second of these contains two match objects at
$/[1][0] and $/[1][1].

  print match      # outputs "let foo := 123"
  $P0 = match[0]   # first subcapture ($1)
  print $P0        # outputs "let"
  $P0 = match[1]   # second subcapture ($2)
  $P1 = $P0[0]     # first nested capture ($2[0])
  print $P1        # outputs "foo"
  $P1 = $P0[1]     # second nested capture ($2[1])
  print $P1        # outputs "123"

Cheers,
Carl


Re: Nested captures

2005-05-09 Thread Damian Conway
I will be releasing a full description of the new capturing semantics in the 
next day or two. It will be appended to the appropriate Synopsis, but I'll 
also post it here. It may be as soon as tomorrow, but I'm away teaching this 
week, so my time is restricted.

Damian


Re: Nested captures

2005-05-09 Thread Autrijus Tang
On Mon, May 09, 2005 at 12:15:30PM +0100, Carl Franks wrote:
> Are you subscribed to perl6-compiler?

Yes, of course I am. :-)

> Yesterday Patrick Michaud posted "PGE features update (corrections)"
> which describes the results you've got:

Ahh.  I must've missed it.  Thanks for the pointer.

/me eagerly awaits new revelation from Damian...

Cheers,
/Autrijus/




Re: Nested captures

2005-05-09 Thread Damian Conway
Autrijus wrote:
> /me eagerly awaits new revelation from Damian...

Be careful what you wish for. Here's draft zero. ;-)

Note that there may still be bugs in the examples, or even in the design.
@Larry has thrashed this through pretty carefully, and Patrick has implemented 
it for PGE, but it's 10.30 at night after a full day's teaching, so I may have 
transcribed the post-thrashing, post-implementation corrections incorrectly. %-)

Damian
-cut--cut--cut--cut--cut--cut-
=head1 Perl 6 rules capturing semantics

=head2 Match objects

All match attempts--successful or not--against any rule, subrule, or
subpattern (see below) return an object of (or derived from) class
C<Match>. That is:

    $match_obj = $str ~~ /pattern/;
    say "Matched" if $match_obj;

In any code that is not nested inside a rule, this returned object is
also automagically assigned to the lexical C<$/> variable. That is:

    $str ~~ /pattern/;
    say "Matched" if $/;

In any code that is nested inside a rule, the C<$/> variable holds the
surrounding rule's nascent C<Match> object (which can be modified via the
internal C<$/>). For example:

    $str ~~ / foo                            # Match 'foo'
              { $/ = new Match: :str<bar> }  # But pretend we matched 'bar'
            /;

C<Match> objects have methods that provide additional information about
the match. For example:

    if m/ def <ident> <codeblock> / {
        say "Found sub def between index $/.from() and index { $/.to()-1 }";
    }

A C<Match> object can also be treated as a boolean, an integer, a
string, an array, or a hash. See below.

=head2 Match results

A failed match returns a C<Match> object whose boolean value is false, whose
integer value is zero, whose string value is C<"">, and whose array and hash
components are empty. For example:

    "bard" ~~ /food/;
    say "Poet inedible" unless $/;

A successful match returns a C<Match> object whose boolean value is
true, whose integer value is typically 1 (except under the C<:g> or
C<:x> flags; see L<Capturing from non-singular matches>), whose string
value is the complete substring that was matched by the entire rule,
whose array component contains all subpattern (unnamed) captures, and
whose hash component contains all subrule (named) captures. For example:

    if ($/) {
        $count += $/;
        say "Matched the substring: $/";
        say "Parens captured: @{$/}";
        say 'Subrules captured:';
        for %{$/}.kv -> $subrule_name, $substr {
            say "\t$subrule_name: $substr";
        }
    }
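
The five "views" of a match result described above can be modelled with a small Python class. This is only an illustrative sketch: the class, its constructor arguments, and the attribute names are assumptions made for the example, not part of the draft or of PGE.

```python
class Match:
    """Toy stand-in for the draft's Match object (names are illustrative)."""

    def __init__(self, text="", ok=True, positional=(), named=None):
        self.ok = ok
        # A failed match has an empty string and empty array/hash parts.
        self.text = text if ok else ""
        self.positional = list(positional) if ok else []
        self.named = dict(named or {}) if ok else {}

    def __bool__(self):            # boolean view: did the match succeed?
        return self.ok

    def __int__(self):             # integer view: 1 on success, 0 on failure
        return 1 if self.ok else 0

    def __str__(self):             # string view: the matched substring
        return self.text

    def __getitem__(self, key):    # array view (int) or hash view (str)
        if isinstance(key, int):
            return self.positional[key]
        return self.named[key]

failed = Match(ok=False)
matched = Match("bard", positional=[Match("ba")], named={"x": Match("rd")})
```

With these objects, `matched[0]` corresponds to a subpattern capture and `matched["x"]` to a named capture, while `failed` is false, zero, and empty in every view.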

=head2 Subpattern captures

Any part of a rule enclosed in capturing parentheses is called a
I<subpattern>. For example:

       #                subpattern
       #  __________________/\___________________
       # |                                       |
       # |      subpattern   subpattern          |
       # |          __/\__    __/\__             |
       # |         |      |  |      |            |
    m:w/ (I am the (walrus), ( khoo )**{2} kachoo) /;

Each subpattern in a rule produces a C<Match> object if it is
successfully matched. This object is assigned into the array inside the
C<Match> object belonging to the surrounding scope -- either the
C<Match> object of the innermost surrounding subpattern (if the
subpattern is nested) or else the C<Match> object of the rule itself.
These assignments to the array are, of course, undone if the subpattern
is backtracked out of.

For example, if the following pattern matched successfully:

       #                 subpat-A
       #  __________________/\___________________
       # |                                       |
       # |         subpat-B  subpat-C            |
       # |          __/\__    __/\__             |
       # |         |      |  |      |            |
    m:w/ (I am the (walrus), ( khoo )**{2} kachoo) /;

then the C<Match> objects representing the matches made by subpat-B and
subpat-C would be successively assigned into the array inside subpat-A's
C<Match> object. Then subpat-A's C<Match> object would be assigned into the
array inside the C<Match> object for the entire rule (i.e. C<$/>'s array).

The array elements of a C<Match> object are referred to using either the
standard array access notation (e.g. C<$/[0]>, C<$/[1]>, C<$/[2]>, etc.)
or else via the corresponding lexically scoped numeric aliases (i.e.
C<$1>, C<$2>, C<$3>, etc.)
So:

    say "$/[1] found between $/[0] and $/[2]";

is the same as:

    say "$2 found between $1 and $3";

Note that the standard array access notation uses zero-based indices
(0,1,2...), whereas the corresponding numeric variables are
numbered by ordinal position (1,2,3...)

Since the array elements of the rule's C<Match> object (i.e. C<$/>)
store individual C<Match> objects representing the substrings that were
matched and captured by the first, second, third, etc. I<outermost>
(i.e. unnested) subpatterns, these elements can be treated like fully
fledged match results. For example:

    if m/ (\d\d\d\d)-(\d\d)-(\d\d) (BCE?|AD|CE)?/ {
        ($yr, $mon, $day) = ($1, $2, $3);

Re: Fwd: Re: Pugs 6.2.0 released.

2005-05-09 Thread David Landgren
Jonathan Worthington wrote:
> Juerd [EMAIL PROTECTED] wrote:
> > You both use "iff". What does that mean?
> I believe it's to be read "if and only if".

Yes, but that doesn't explain what it means. Rather than me try to 
explain it (poorly)...

http://en.wikipedia.org/wiki/If_and_only_if
David



Re: Fwd: Re: Pugs 6.2.0 released.

2005-05-09 Thread Rob Kinyon
What's really odd is that document links to
http://en.wikipedia.org/wiki/Exclusive_disjunction which ends up
stating that chained xors are associative and commutative, meaning
that instead of acting as one(), it counts parity.

Rob

On 5/9/05, David Landgren [EMAIL PROTECTED] wrote:
> Jonathan Worthington wrote:
> > Juerd [EMAIL PROTECTED] wrote:
> >
> > > You both use "iff". What does that mean?
> >
> > I believe it's to be read "if and only if".
>
> Yes, but that doesn't explain what it means. Rather than me try to
> explain it (poorly)...
>
> http://en.wikipedia.org/wiki/If_and_only_if
>
> David
 



Re: Nested captures

2005-05-09 Thread Patrick R. Michaud
Here's some more commentary to draft zero of the capturing semantics
(thanks, Damian!), based partially on PGE's current implementation.

On Mon, May 09, 2005 at 10:51:53PM +1000, Damian Conway wrote:
> [...]
> =head2 Nested subpattern captures
> [...]
> There may also be shortcuts for accessing nested components of a subpattern,
> specifically:
>
> # Perl 6...
> #
> #  $1------------------------------ $2---------- $3---------------------
> #  |     $1.1  $1.2-------------- | |          | |     $3.1  $3.2----- |
> #  |     |   | |         $1.2.1 | | |          | |     |   | |       | |
> #  |     |   | |         |   |  | | |          | |     |   | |       | |
> m/ ( The (\S+) (guy|gal|g(\S+)  ) ) (sees|calls) ( the (\S+) (gal|guy) ) /;
>
> but this has not yet been decided.

After thinking on this a bit, I'm hoping we don't do this -- at least not
initially.  I'm not sure there's a lot of advantage of C< $1.1 > over
C< $1[0] >, and one starts to wonder about things like $1.$j.2 and
$1[$j].2 and the like.

> =head2 Quantified subpattern captures
> [...]
> If a subpattern is directly quantified using the C<?> or C<??> quantifier,
> it produces a single C<Match> object. That object is successful if the
> subpattern did match, and unsuccessful if it was skipped.

I'm not sure that PGE has these exact semantics for C<?> yet -- I'll have
to check.

> =head2 Indirectly quantified subpattern captures
> [...]
> A subpattern may sometimes be nested inside a quantified non-capturing
> structure:
>
> #       non-capturing    quantified
> #   _________/\__________  __/\__
> #  |                     ||      |
> #  |  $1         $2      ||      |
> #  |  _^_      ___^___   ||      |
> #  | |   |    |       |  ||      |
> m/ [ (\w+) \: (\w+ \s+)* ]**{2...} /
>
> [...] In Perl 5, any repeated captures of this kind:
>
>  # Perl 5 equivalent...
>  m/ (?: (\w+) \: (\w+ \s+)* ){2,} /x
>
> would overwrite the previous captures to C<$1> and C<$2> each time the
> surrounding non-capturing parens iterated. So C<$1> and C<$2> would
> contain only the captures from the final repetition.
>
> This does not happen in Perl 6. Any indirectly quantified subpattern is
> treated like a directly quantified subpattern. Specifically, an
> indirectly quantified subpattern also returns an array of C<Match>
> objects, so the corresponding array element for the indirectly
> quantified capture will store an array reference, rather than a single
> C<Match> object.

It might be worthwhile to add a note here that one can still get
at the results of the final repetition by using $1[-1] and $2[-1].

> =head2 Subpattern numbering
> [...]
> Of course, the leading C<undef>s that Perl 5 would produce do convey
> (albeit awkwardly) which alternative actually matched. If that
> information is important, Perl 6 has several far cleaner ways to
> preserve it. For example:
>
>  rule alt (Str $n) { {$/ = $n} }
>
>  m/ <alt tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
>   | <alt BEM>  (every) (green) (BEM) (devours) (faces)
>   /;

If the C<< <alt> >> rule is accepting a string argument, the match
statement probably needs to read

 m/ <alt: tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | <alt: BEM>  (every) (green) (BEM) (devours) (faces)
  /;


> =head2 Repeated captures of the same subrule
>
> =head3 Scalar aliases applied to quantified constructs
> [...]
> A set of quantified I<non-capturing> brackets always returns a
> single C<Match> object which contains only the complete substring
> that was matched by the full set of repetitions of the brackets (as
> described in L<Named scalar aliases applied to non-capturing brackets>).

At present, PGE isn't working this way -- aliased quantified non-capturing
brackets return an array of match objects, same as other quantified
structures.  This can be changed, but I kind of like the consistency
that results --

    "coffee fifo fumble" ~~ m/ .*? $<effs>:=[f <-[f]>**{1..2} \s*]+ /;

PGE currently gives $<effs> an array of matches, same as for the
other capturing constructs.  If someone wants to capture the full
set, it's easy enough to do

    "coffee fifo fumble" ~~ m/ .*? $<effs>:=[ [f <-[f]>**{1..2} \s*]+ ] /;

and it's pretty clear what was intended.

> =head3 Array aliasing
> =head3 Hash aliasing
> =head3 External aliasing
> =head2 The C<:parsetree> flag
> etc.

At the moment PGE doesn't support these, and probably won't until
they're actually needed in the course of developing the compiler
(or until someone adds them).

> [...]
> Moreover, the C<:parsetree> flag overrides the exemption of C<< «name» >>
> subrule calls, so they act as if they were C<< <name> >> calls instead. They
> generate C<Match> objects, and those objects are also appended onto the
> surrounding scope's C<Match> array.

Do we still have the C<< «name» >> syntax for rules?  S05 doesn't
mention it, A05 mentions it as a non-capturing subrule but I think
we've since changed to C<< <?name> >> instead.  If we don't have
C<< «name» >> I'll adjust S05/A05 accordingly.

Pm
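
The Perl 5 behaviour that the quoted draft contrasts with can be seen in any engine that numbers groups Perl 5 style. Below is a minimal Python sketch (Python's `re`, not PGE; the sample text and the simplified `key:value` pattern are assumptions for illustration). It shows that only the final repetition survives in the numbered groups, and one way to collect every repetition so that the last one is simply element `[-1]`.

```python
import re

text = "a:1 b:2 "

# Quantified non-capturing group: only the last repetition survives.
last_only = re.match(r"(?:(\w+):(\d+)\s+){2,}", text).groups()

# Collect every repetition by matching the inner pattern repeatedly.
all_keys, all_vals = [], []
for m in re.finditer(r"(\w+):(\d+)\s+", text):
    all_keys.append(m.group(1))
    all_vals.append(m.group(2))
```

In this model `all_keys` and `all_vals` stand in for the arrays that the draft says an indirectly quantified capture produces, and `all_keys[-1]` agrees with `last_only[0]`.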


Re: Nested captures

2005-05-09 Thread Paul Seamons
> =item *
>
> Quantifiers (except C<?> and C<??>) cause a matched subrule or subpattern to
> return an array of C<Match> objects, instead of just a single object.

What is the effect of the quantifiers C<**{0,1}> and C<**{0,1}?> ?  Will they
behave like ? and ?? and return a single object - or will they cause the
quantified subrule or subpattern to return an array of C<Match> objects?

Paul


Re: Nested captures

2005-05-09 Thread Patrick R. Michaud
On Mon, May 09, 2005 at 09:47:14AM -0600, Paul Seamons wrote:
> > =item *
> >
> > Quantifiers (except C<?> and C<??>) cause a matched subrule or subpattern to
> > return an array of C<Match> objects, instead of just a single object.
>
> What is the effect of the quantifiers C<**{0,1}> and C<**{0,1}?> ?  Will they
> behave like ? and ?? and return a single object - or will they cause the
> quantified subrule or subpattern to return as an array of C<Match> objects?

First, I much prefer an alternate wording to Damian's:

   The C<*>, C<+>, and C<**{...}> quantifiers all produce an array
   of C<Match> objects instead of just a single object.

To answer your question, C<**{0..1}> always produces an array of
Match objects (think C<**{$m..$n}> where $m and $n may not be
immediately known), while C<?> always produces a single Match object.

Both C<**{0..1}> and C<?> will match zero or one occurrence of the
thing being quantified, but a non-matching C<**{0..1}> results in
a zero-length array, while a non-matching C<?> results in an
unsuccessful Match object.

Pm
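
The distinction between the two return shapes can be sketched in Python. This is a toy model, not PGE: `Match`, `optional`, and `repeated` are hypothetical names, and Python's `re` is used only to match the quantified atom.

```python
import re
from dataclasses import dataclass

@dataclass
class Match:
    ok: bool
    text: str = ""

    def __bool__(self):
        return self.ok

def optional(atom, text):
    """Model of (atom)?: always one Match, possibly unsuccessful."""
    m = re.match(atom, text)
    return Match(True, m.group(0)) if m else Match(False)

def repeated(atom, text, lo, hi):
    """Model of (atom)**{lo..hi}: a list of Match objects, or None if
    fewer than lo repetitions are found."""
    rx, found, pos = re.compile(atom), [], 0
    while len(found) < hi:
        m = rx.match(text, pos)
        if m is None or m.end() == pos:   # stop on failure or empty match
            break
        found.append(Match(True, m.group(0)))
        pos = m.end()
    return found if len(found) >= lo else None

skipped_single = optional("50", "42")          # one unsuccessful Match
skipped_array = repeated("50", "42", 0, 1)     # zero-length list
matched_array = repeated(r"\d", "42", 0, 1)    # list with one Match
```

The caller can tell the two forms apart by type alone: the `?` model always hands back a single object to test, while the range model always hands back a list to iterate.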


Re: Nested captures

2005-05-09 Thread Larry Wall
On Mon, May 09, 2005 at 09:47:14AM -0600, Paul Seamons wrote:
: > =item *
: >
: > Quantifiers (except C<?> and C<??>) cause a matched subrule or subpattern to
: > return an array of C<Match> objects, instead of just a single object.
:
: What is the effect of the quantifiers C<**{0,1}> and C<**{0,1}?> ?

That would be **{0..1} and **{0..1}? actually, since we're treating ..
as a real range and {} as a real closure.  (Though they're presumably
optimized away for constant ranges.)

: Will they
: behave like ? and ?? and return a single object - or will they cause the
: quantified subrule or subpattern to return as an array of C<Match> objects?

The latter.  In the abstract it would be nice to define ? in terms of
**{0..1}, but the simple fact is that we can't afford to let **{$n..$m}
change the form of its return value merely because $n just happens
to be 0 and $m just happens to be 1.  And that actually makes the
distinction between ? and **{0..1} more useful, insofar as it lets you
specify which form of return you want.

Larry


Re: Nested captures

2005-05-09 Thread Larry Wall
On Mon, May 09, 2005 at 10:33:33AM -0500, Patrick R. Michaud wrote:
: > =head2 Subpattern numbering
: > [...]
: > Of course, the leading C<undef>s that Perl 5 would produce do convey
: > (albeit awkwardly) which alternative actually matched. If that
: > information is important, Perl 6 has several far cleaner ways to
: > preserve it. For example:
: >
: >  rule alt (Str $n) { {$/ = $n} }
: >
: >  m/ <alt tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
: >   | <alt BEM>  (every) (green) (BEM) (devours) (faces)
: >   /;
:
: If the C<< <alt> >> rule is accepting a string argument, the match
: statement probably needs to read
:
:  m/ <alt: tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
:   | <alt: BEM>  (every) (green) (BEM) (devours) (faces)
:   /;

This seems like a rather ugly syntax for what is essentially a label,
or a null rule.  I wonder if we can come up with something a little
prettier.  Something like:

 m/ <null:tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | <null:BEM>  (every) (green) (BEM) (devours) (faces)
  /;

 m/ <tea:=>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | <BEM:=>  (every) (green) (BEM) (devours) (faces)
  /;

 m/ <:tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | <:BEM>  (every) (green) (BEM) (devours) (faces)
  /;

or even plain label syntax:

 m/ tea: (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | BEM:  (every) (green) (BEM) (devours) (faces)
  /;

if we recognize that : makes no sense as a backtrack control on a
non-quantified item.

Larry


Re: Nested captures

2005-05-09 Thread Patrick R. Michaud
On Mon, May 09, 2005 at 11:02:58AM -0500, Patrick R. Michaud wrote:
> On Mon, May 09, 2005 at 09:47:14AM -0600, Paul Seamons wrote:
> > > =item *
> > >
> > > Quantifiers (except C<?> and C<??>) cause a matched subrule or
> > > subpattern to return an array of C<Match> objects, instead of just
> > > a single object.
>
> First, I much prefer an alternate wording to Damian's:
>
>    The C<*>, C<+>, and C<**{...}> quantifiers all produce an array
>    of C<Match> objects instead of just a single object.

Perhaps better is:

   The C<*>, C<+>, and C<**{...}> (but not C<?> or C<??>) quantifiers
   cause a matched subrule or subpattern to return an array of C<Match>
   objects, instead of just a single object.

And since I've noticed that a lot of people who see this document
end up asking about the relationship between C<?> and C<**{0..1}>,
perhaps we should just put an explicit note in there somewhere
about it.  For example, at the end of the section we could say
something like:

   Note that C<?> and C<**{0..1}> both mean "match zero or one
   occurrence", but C<?> always produces a single C<Match> object
   (which may be an unsuccessful match) and C<**{0..1}> always
   produces an array of C<Match> objects (which will likely be
   empty for an unsuccessful match).

Pm


Re: Nested captures

2005-05-09 Thread Patrick R. Michaud
On Mon, May 09, 2005 at 09:14:02AM -0700, Larry Wall wrote:
> :  m/ <alt: tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
> :   | <alt: BEM>  (every) (green) (BEM) (devours) (faces)
> :   /;
>
> This seems like a rather ugly syntax for what is essentially a label,
> or a null rule.  I wonder if we can come up with something a little
> prettier.

I wonder if it's deserving of much in the way of special syntax at all,
given that we have a variety of ways to do it (closures come to mind).
In the example above, one could just as easily test $1 for "don't" vs.
"every" to figure out which alternation matched.  Indeed, a simple answer
is:

 m/ $<tea>:=<null>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | $<bem>:=<null>  (every) (green) (BEM) (devours) (faces)
  /;

and then

    if ($/<tea>) { say "I hate solar tea" }
    if ($/<bem>) { say "I love bug-eyed monsters" }

But from your examples:

>  m/ <null:tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
>   | <null:BEM>  (every) (green) (BEM) (devours) (faces)
>   /;

Hmm, capturing to $<null> seems odd.

>  m/ <tea:=>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
>   | <BEM:=>  (every) (green) (BEM) (devours) (faces)
>   /;

Please, not this one -- it looks too much like a subrule call to
tea(=) (from A05).

>  m/ <:tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
>   | <:BEM>  (every) (green) (BEM) (devours) (faces)
>   /;

This one has possibilities.   It looks like a generalization of
pair constructors though, so one could also conceivably do things
like <:tea(0)> and <:tea('foo')>.  With that one could then write

 m/ <:alt('tea')>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
  | <:alt('BEM')>  (every) (green) (BEM) (devours) (faces)
  /;

and have

    given $<alt> {
        when 'tea' { say "I hate solar tea" }
        when 'BEM' { say "I love bug-eyed monsters" }
    }

> or even plain label syntax:
>
>  m/ tea: (don't) (ray) (me) (for) (solar tea), (d'oh!)
>   | BEM:  (every) (green) (BEM) (devours) (faces)
>   /;
>
> if we recognize that : makes no sense as a backtrack control on a
> non-quantified item.

This sounds too special-case to me.  Also, I think it does make
sense to backtrack control on non-quantified subrules and subpatterns,
so we'd have to say that : has this meaning only after a non-quantified
literal.  I feel there are too many other good ways to do it to
add this one.

Pm


Zero-day rules implementation status in Pugs

2005-05-09 Thread Autrijus Tang
On Mon, May 09, 2005 at 10:51:53PM +1000, Damian Conway wrote:
> Autrijus wrote:
>
> > /me eagerly awaits new revelation from Damian...
>
> Be careful what you wish for. Here's draft zero. ;-)

...and here is my status report of the Zero-Day exploit, err,
implementation, in Pugs. :-)

Note that the output from Pugs's interactive command line is in
.perl format.  Patrick++ for a nicely designed PGE API!

> =head2 Match objects

~~ returns match object:

    pugs> true(1 ~~ /1/)
    bool::true
    pugs> true(1 ~~ /2/)
    bool::false

$/ gets assigned:

    pugs> 1 ~~ /1/; true $/
    bool::true
    pugs> 1 ~~ /2/; true $/
    bool::false

Match object can take .from, .to and .matches methods:

    pugs> 1 ~~ /1/; ($/.from, $/.to, $/.matches)
    (0, 1, ())

> =head2 Match results

Failed match:

    pugs> "bard" ~~ /food/; (?$/, +$/, ~$/)
    (bool::false, 0.0, '')

Successful match:

    pugs> "bard" ~~ /(ba) $<x> := (rd)/; (?$/, +$/, ~$/, ~$/[0], ~$/<x>)
    (bool::true, 1.0, 'bard', 'ba', 'rd')

> =head2 Subpattern captures

Positional captures:

    pugs> 42 ~~ /(.)/; (~$/[0], ~$1, ~$/1, ~$1)
    ('4', '4', '4', '4')

Optional captures:

    pugs> 42 ~~ /(50)?/; (?$/[0], ?$1, ?$/1, ?$1)
    (bool::false, bool::false, bool::false, bool::false)

> =head2 Nested subpattern captures

    pugs> 42 ~~ /(.(.))/; ~$/[0][0]
    '2'

> =head2 Quantified subpattern captures

    pugs> 42 ~~ /(.)*/; ~$/[0][0]
    '4'

> =head2 Indirectly quantified subpattern captures

    pugs> 42 ~~ /[4 (.)]*/; ~$/[0][0]
    '2'

> =head2 Subpattern numbering

    pugs> 42 ~~ /(50) | (42)/; ~$1
    '42'

> =head2 Subrule captures

This should work -- but PGE does not yet have any built-in subrules,
and only <ident> is given in examples.  Is there a list of all
builtin (global) subrules?

> =head2 Aliasing

    pugs> 42 ~~ /$<match> := (..)/; ~$<match>
    '42'

    pugs> 42 ~~ /$99 := (..)/; ~$99
    '42'

Thanks,
/Autrijus/




Re: Nested captures

2005-05-09 Thread Uri Guttman
>>>>> "PRM" == Patrick R Michaud [EMAIL PROTECTED] writes:

  PRM> After thinking on this a bit, I'm hoping we don't do this -- at least not
  PRM> initially.  I'm not sure there's a lot of advantage of C< $1.1 > over
  PRM> C< $1[0] >, and one starts to wonder about things like $1.$j.2 and
  PRM> $1[$j].2 and the like.

i would say that you can use .1 only when all the indexes are literals
like $1.2.1. anything else must be a proper index expression on $1 like
$1[$j][1].

mixing those would scare me more than anything and it isn't much of a
hardship to use the full expression form when you need it. also in perl,
indexing isn't used nearly as often as for loops, so if you did grab
something that was in an array in some match value, you would more likely
loop over it than index into it. so again, the hardship of the index
syntax isn't a big deal as it should be rarely needed.

just my $1.02. :)

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: stdio

2005-05-09 Thread Aaron Sherman
Nothing makes you re-think your reply length like having your mailer
lose your message ;-)

A lot of your message revolves around this idea that there's a normal
file open semantic. What I've tried (but clearly failed) to articulate
previously is that this notion is becoming archaic in what is shaping up
to be the post-POSIX world. Gnome VFS and the .NET Framework are good
examples of this.

Now, I don't think that the state of the world is yet such that we can
say that Perl should embrace the Gnome VFS or .NET Framework or whatever
OS/X does or any other meta-file-framework on people, but I do think
that we can safely expect Perl 6 to have to deal with these concepts and
would be well served by building in a standard way to add your More Than
One Way later on through CPAN.

On Fri, 2005-05-06 at 15:10, Larry Wall wrote:
> On Fri, May 06, 2005 at 08:19:05AM -0400, Aaron Sherman wrote:
> : "open" as a verb is extremely ambiguous. In dictionary searches I see as
> : many as 19 definitions just for the verb form.
>
> Well, sure, but also need to take Perl history into account, where dwimmy
> open is considered something of a liability.

I've come across many folks (and I'm one of them) who a) like the "-"
magic a lot and b) are in the majority in my experience. I definitely
see the concern around adverbial bits showing up in the text (e.g. "<"
and ">"), but not magical filenames like "-". Three-argument open is a
godsend, but I'd love to preserve the bits that were useful in Perl 6.

> I think the dwimminess of open() probably arises only from MMD, and
> a string or array of string in the first argument implies ordinary
> file open.  That means perhaps we have
>
>     open uri($x)

Ok, I had a long reply to this, but I'll sum up:

  * open uri($x) implies that I know that $x is a URI.
  * Being able to use IO::URI :stropen makes an otherwise
cumbersome chore into a breeze, e.g.:
use IO::URI :stropen;
use IO::SomethingElse :stropen;
GetOptions('f|file' => \$input_file);
my $input = open($input_file, :r);
  * I don't think any of this should be on by default, with the
possible exception of -, but that's only possible.

 : 
 : sub File::Copy::copy(IO $file1 :r, IO $file2 :w) {...}

I think you took this as a request for long, ::-separated names in
everyday code. I was just being verbose for one-line clarity, not
because I think subdefs should look like that in real code. To be more
specific:

module File::Copy is exportcopy {
sub copy(IO $file1 :r, IO $file2 :w) {...}
}

And I was asking if that would pass the adverb down to the constructor
for IO like so:

$file1 = IO.new($param, :r)

and if not, how one can do that.

 Those are all pretty bletcherous.  How 'bout
 
 io('-') == io('-');

In Perl 5, File::Copy::copy is not a pipelineable (or potentially lazy)
operation because it can be implemented by OS-level special-purpose
functions. I think that overloading infix:==(IO $a,IO $b) to behave
this way would be potentially very misleading to the programmer, and
thus should probably be left alone.

Should file copying be a core function or a module as it was in Perl 5?
I don't know, but either way I think it needs to continue to make OS
specific copying available for the six or seven platforms that Perl 5
currently knows about.

[...example code...]
 : I would need some error handling here, and possibly would need to defer
 : to a parent as a fallback.

Sure, that all makes sense. I was white-boarding, and haven't even yet
solved the problem of scoping the change.

 : That brings up the idea of delegation... should this be handled by
 : delegation instead of the way I've done it? Not sure. I'm still trying
 : to figure out how to make this scope correctly so that:

Delegation might work very well here. There was a nice use for it that I
had in mind for IO::Pipe as well. I have to re-read the array delegation
rules to see how that behaves, but thanks for reminding me of it.


-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: Nested captures

2005-05-09 Thread Larry Wall
On Mon, May 09, 2005 at 10:33:33AM -0500, Patrick R. Michaud wrote:
: After thinking on this a bit, I'm hoping we don't do this -- at least not
: initially.  I'm not sure there's a lot of advantage of C< $1.1 > over
: C< $1[0] >, and one starts to wonder about things like $1.$j.2 and
: $1[$j].2 and the like.

Or maybe it should generalize the other direction.  We just got through
the great bracket shift to make $x<a> mean $x{'a'} so that we can recognize
constant hash subscripts easily.  Maybe $x.1 is just the numeric analog
of that.  And $x.$j.2 could just fall out of that, where the indirect
method dispatcher knows to turn a numeric method name into a subscript.

Larry


Re: Nested captures

2005-05-09 Thread Patrick R. Michaud
On Mon, May 09, 2005 at 11:34:10AM -0700, Larry Wall wrote:
> On Mon, May 09, 2005 at 10:33:33AM -0500, Patrick R. Michaud wrote:
> : After thinking on this a bit, I'm hoping we don't do this -- at least not
> : initially.  I'm not sure there's a lot of advantage of C< $1.1 > over
> : C< $1[0] >, and one starts to wonder about things like $1.$j.2 and
> : $1[$j].2 and the like.
>
> Or maybe it should generalize the other direction.  We just got through
> the great bracket shift to make $x<a> mean $x{'a'} so that we can recognize
> constant hash subscripts easily.  Maybe $x.1 is just the numeric analog
> of that.  And $x.$j.2 could just fall out of that, where the indirect
> method dispatcher knows to turn a numeric method name into a subscript.

Hmmm, then would $x.$j.2 then be equivalent to $x[$j-1][1] ?  

Pm
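
The off-by-one at issue is easy to see when the dotted form is translated mechanically. The following Python sketch is purely illustrative: `dotted_to_subscripts` is a hypothetical helper, and it adopts the restriction suggested earlier in the thread that the dotted form accepts only literal integers.

```python
def dotted_to_subscripts(path):
    """Convert a literal 1-based dotted path such as "1.2.1" into the
    0-based subscripts applied to the top-level match object."""
    parts = path.split(".")
    if not all(p.isdigit() and int(p) >= 1 for p in parts):
        raise ValueError("only literal positive integers are allowed")
    return [int(p) - 1 for p in parts]

subscripts = dotted_to_subscripts("1.2.1")   # models $1.2.1 as $/[0][1][0]
```

Every component shifts by one, which is why a variable component such as `$j` would need the `$j-1` adjustment in the bracketed form.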


Re: Nested captures

2005-05-09 Thread Larry Wall
On Mon, May 09, 2005 at 02:08:31PM -0500, Patrick R. Michaud wrote:
: Hmmm, then would $x.$j.2 then be equivalent to $x[$j-1][1] ?  

Ouch.

Larry


Re: Nested captures

2005-05-09 Thread Larry Wall
On Mon, May 09, 2005 at 12:14:35PM -0700, Larry Wall wrote:
: On Mon, May 09, 2005 at 02:08:31PM -0500, Patrick R. Michaud wrote:
: : Hmmm, then would $x.$j.2 then be equivalent to $x[$j-1][1] ?  
: 
: Ouch.

Maybe that's a good reason to switch from 1-based to 0-based
$digit vars.  Not sure what that would do to the current $0 though.
Most of the time $/ can stand in for it, I guess, though s/.../$//
is visually problematic.  We could maybe resurrect $.

Larry


Re: Nested captures

2005-05-09 Thread Uri Guttman
>>>>> "LW" == Larry Wall [EMAIL PROTECTED] writes:

  LW> On Mon, May 09, 2005 at 12:14:35PM -0700, Larry Wall wrote:
  LW> : On Mon, May 09, 2005 at 02:08:31PM -0500, Patrick R. Michaud wrote:
  LW> : : Hmmm, then would $x.$j.2 then be equivalent to $x[$j-1][1] ?
  LW> :
  LW> : Ouch.

  LW> Maybe that's a good reason to switch from 1-based to 0-based
  LW> $digit vars.  Not sure what that would do to the current $0 though.
  LW> Most of the time $/ can stand in for it, I guess, though s/.../$//
  LW> is visually problematic.  We could maybe resurrect $.

or do what i mentioned, not allow mixing of the two styles of match
access. i don't see any real win for mixing them. indexing into matched
arrays will not be so common to deserve conflating the 0 and 1 based
indexing as well as the notations. leave it with $1.1 and $1[0] as being
the two styles. you must use literal integers with the former and it is
1 based. you can use any expressions with the latter and it is 0
based. by allowing $1[$j].1 you save only 1 char over $1[$j][0] and
would cause major confusion IMO.

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: Nested captures

2005-05-09 Thread mark . a . biggar
Can I say $*1, $*2, etc, to get perl5 flattened paren counting captures?  We
need something like that to make perl5-perl6 translation easier; otherwise
we'd have to parse perl5 RE instead of just slapping on a :p5.   Unless :p5
also means that you get a single already flattened match object.

--
Mark Biggar
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]


 On Mon, May 09, 2005 at 02:08:31PM -0500, Patrick R. Michaud wrote:
 : Hmmm, then would $x.$j.2 then be equivalent to $x[$j-1][1] ?  
 
 Ouch.
 
 Larry


Re: Nested captures

2005-05-09 Thread Patrick R. Michaud
On Mon, May 09, 2005 at 08:43:39PM +, [EMAIL PROTECTED] wrote:
> Can I say $*1, $*2, etc, to get perl5 flattened paren counting captures?
> We need something like that to make perl5-perl6 translation easier;
> otherwise we'd have to parse perl5 RE instead of just slapping on a :p5.
> Unless :p5 also means that you get a single already flattened match object.

PGE will have a perl5 RE parser built in to handle the :p5
option, and it will return match objects according to perl 5's
capture semantics with no nesting (i.e., counting left parens).

I will probably start a perl 5 RE parser just to get it started, but 
after that I'd prefer to turn it over to someone else to maintain.  
Note that PGE will already provide the matching engine itself -- 
we simply need something that converts p5 expression trees into 
PGE's expression trees.

In addition, I'm planning to write a glob/wildcard parser, which does
matches based on Unix filename globbing syntax.

Pm
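
The relationship between the nested layout and Perl 5's numbering can be sketched in Python: a pre-order walk of the nested capture tree visits captures in the order of their opening parentheses, which is exactly how Perl 5 counts. The `(text, children)` tuple representation and the `flatten` helper are assumptions for this illustration, not anything PGE provides.

```python
def flatten(node):
    """Pre-order walk of a (text, children) capture tree.

    Element 0 of the result is the whole match; element n is the capture
    that Perl 5 would call $n (counting left parentheses).
    """
    text, children = node
    out = [text]
    for child in children:
        out.extend(flatten(child))
    return out

# "123" matched against /(.(.(.)))/ in the nested layout
nested = ("123", [("123", [("23", [("3", [])])])])
flat = flatten(nested)
```

This covers only captures that are not quantified; arrays of matches produced by quantifiers have no direct Perl 5 counterpart.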


Re: Nested captures

2005-05-09 Thread Uri Guttman
>>>>> "DC" == Damian Conway [EMAIL PROTECTED] writes:


  DC>  grammar Shell::Commands {

  DC>  my $lastcmd;

  DC>  rule cmd { $/:=<mv> | $/:=<cp> }

  DC>  rule mv { $lastcmd:=(mv)  $<files>:=[ <ident> ]+  $<dir>:=<ident> }
  DC>  rule cp { $lastcmd:=(cp)  $<files>:=[ <ident> ]+  $<dir>:=<ident> }

  DC>  sub lastcmd { return $lastcmd }
  DC>  }

  DC>  while shift ~~ m/<Shell::Commands.cmd>/ {
  DC>  say "From: @{$<files>}";
  DC>  say "  To: $<dir>";
  DC>  }

since files and dirs are internal aliases (their names are in <>),
shouldn't those match accesses be $/<files> and $/<dir>?

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org