Re: Hypothetical synonyms

2002-08-29 Thread Larry Wall

Don't forget you can parameterize rules with subrules.  I don't see
any reason you couldn't write a



kind of rule and do whatever you like with the submatched bits.
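
A rough Perl 5 analogue of a parameterized rule, offered only as a sketch: a sub that takes the delimiter as a parameter and builds the pattern from it. The names delimited_field_re and field_with are made up for illustration and are not part of any proposed Perl 6 syntax.

```perl
use strict;
use warnings;

# Hypothetical analogue of a parameterized rule: build a pattern for a
# field enclosed in $delim, falling back to a bare run of non-space.
sub delimited_field_re {
    my ($delim) = @_;
    my $d = quotemeta $delim;    # make the delimiter safe inside a regex
    return qr/^\s*(?:${d}(.*?)${d}|(\S+))/;
}

# Apply the generated pattern and hand back whichever submatch was set.
sub field_with {
    my ($delim, $line) = @_;
    my $re = delimited_field_re($delim);
    return unless $line =~ $re;
    return defined $1 ? $1 : $2;
}

print field_with("'", "  'a b' c"), "\n";
print field_with('|', '|x y| z'),   "\n";
```

The caller is then free to do whatever it likes with the submatched text, which is the point being made above.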

Larry




Re: Hypothetical synonyms

2002-08-29 Thread Janek Schleicher

Luke Palmer wrote at Thu, 29 Aug 2002 15:21:57 +0200:

>> The ° character doesn't have any special meaning,
>> that's why I chose it in the above example.
>> However, it also symbolizes a little capturing
>> and as it isn't filled,
>> it could really symbolize an uncapturing.
> 
> Interesting idea.  I'm not sure if I agree with it yet.  However, I don't 
> agree with your syntax, as I can't type that character.  

Yeah, that's of course a problem.
But I can't think of any other typeable character
with no other meaning that could be chosen.

> Is it possible to 
> modify what was captured?
> 
>   /" ([ \\ . { chop; chop } | <[^\\]> ]*?) "/
> 
> Or is that just too ugly?

IMHO, that looks as ugly as the other workaround solutions :-)

I think, the greatest strength of Perl is that
it expresses simple things in a simple, short and natural way.

Such regexp behaviour would simplify a lot of
jobs where we currently need workarounds
for something as simple as
  "Match it, capture the relevant parts and ignore some irrelevant subparts".

It can always be implemented with
- more captures, joined together later
or
- a substitution/transliteration on the captured part
  to remove the irrelevant subparts
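
For comparison, here is a minimal Perl 5 sketch of both workarounds applied to the quoted-or-bare field problem from the start of the thread. The sub names are hypothetical, and the second variant assumes that stripping one pair of surrounding double quotes is all the cleanup needed.

```perl
use strict;
use warnings;

# Workaround 1: capture the alternatives separately, then pick the one
# that actually took part in the match.
sub field_by_extra_captures {
    my ($line) = @_;
    my ($quoted, $bare) = $line =~ /^\s*(?:"(.*?)"|(\S+))/
        or return;
    return defined $quoted ? $quoted : $bare;
}

# Workaround 2: capture everything, quotes included, then remove the
# irrelevant subparts with a substitution.
sub field_by_substitution {
    my ($line) = @_;
    my ($stuff) = $line =~ /^\s*(".*?"|\S+)/
        or return;
    $stuff =~ s/^"(.*)"$/$1/s;    # strip the surrounding quotes, if any
    return $stuff;
}

for my $line ('  "two words" rest', 'bare rest') {
    print field_by_extra_captures($line), " / ",
          field_by_substitution($line), "\n";
}
```

Both work, but neither says "capture this, minus the quotes" directly in the pattern.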


IMHO, it's comparable to the problem
  "Group it, but don't capture it"
which was solved with the (?:) syntax.
From that point of view,
a (?_...) (question mark, underscore) syntax could also be an idea,
with the meaning
  "Group it, and don't capture it, not even in surrounding captures".
With it,
the OP problem would look like:
/\s*((?_").*?(?_")|\S+)/;

(I chose the underscore, as it is typeable and could have the mnemonic meaning
 of some underlying, unimportant background group)


But perhaps, I'm only dreaming 


Cheerio,
Janek




Re: Hypothetical synonyms

2002-08-29 Thread Luke Palmer

> The ° character doesn't have any special meaning,
> that's why I chose it in the above example.
> However, it also symbolizes a little capturing
> and as it isn't filled,
> it could really symbolize an uncapturing.

Interesting idea.  I'm not sure if I agree with it yet.  However, I don't 
agree with your syntax, as I can't type that character.  Is it possible to 
modify what was captured?

/" ([ \\ . { chop; chop } | <[^\\]> ]*?) "/

Or is that just too ugly?

Luke




Re: Capturing alternations (was Re: Hypothetical synonyms)

2002-08-28 Thread Damian Conway

Piers wrote:


> Not exactly DWIM, but how about:
> 
>   my $stuff = /^\s* [ "(.*?)" | (\S+) ] : { $foo := $+ }/;
> 
> Assuming $+ means 'the last capture group matched' as it does now.
>

Or just:

 my $stuff = /^\s* [ "$foo:=(.*?)" | $foo:=(\S+) ]/;

BTW, that doesn't actually *do* the match. It merely puts a reference
to a rule object into $stuff.
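
The nearest Perl 5 analogue of this distinction is qr// versus m//. In Perl 5 itself a bare /.../ assigned to a scalar does match against $_, so the sketch below uses qr// to stand in for "a reference to a rule object"; the names $rule and apply_rule are illustrative only.

```perl
use strict;
use warnings;

# qr// compiles the pattern and returns a regex object; nothing has
# been matched at this point.
my $rule = qr/^\s*(?:"(.*?)"|(\S+))/;

# The match only happens when the object is applied with =~.
sub apply_rule {
    my ($line) = @_;
    return $line =~ $rule ? 1 : 0;
}

print ref($rule), "\n";                 # prints "Regexp"
print apply_rule('  "x y" z'), "\n";    # prints 1
```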

Perhaps we all actually meant variants on:

 my $stuff = m/^\s* [ "$0:=(.*?)" | $0:=(\S+) ]/;

???

Damian







Capturing alternations (was Re: Hypothetical synonyms)

2002-08-28 Thread Trey Harris

In a message dated Thu, 29 Aug 2002, Janek Schleicher writes:

> Aaron Sherman wrote at Wed, 28 Aug 2002 00:34:15 +0200:
>
> > $stuff = (defined($1)?$1:$2) if /^\s*(?:"(.*?)"|(\S+))/;
>
> It gives me the idea of a missing feature:
>
> What really should be expressed is:
>
> my ($stuff) = /^\s*("°.*?"°|\S+)/;
>
> where the ° character would mean,
> "Don't capture the previous element".

Hmm.  One thing that has always bothered me about regexes is capturing
parentheses in alternations.  It seems to me that:

my ($stuff) = /^\s* [ "(.*?)" | (\S+) ]/;

should DWIM somehow, since it's impossible that both parens will capture.
So when the same number of capturing parens appear in each of an
alternation, they should factor out to being a single return value.

Is this possible in the general case?
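
For reference, Perl 5 later gained something close to this for the simple case: the branch-reset group (?|...), available from Perl 5.10. The sketch below assumes Perl 5.10 or newer; the sub name is hypothetical.

```perl
use strict;
use warnings;
use 5.010;    # branch reset (?|...) needs Perl 5.10 or later

sub branch_reset_field {
    my ($line) = @_;
    # Inside (?|...), capture numbering restarts in each alternative,
    # so whichever branch matches fills $1.
    my ($stuff) = $line =~ /^\s*(?|"(.*?)"|(\S+))/;
    return $stuff;
}

say branch_reset_field('  "two words" rest');
say branch_reset_field('bare rest');
```

Each alternative here has exactly one capturing group, which is the "same number of capturing parens" condition described above.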

Trey




Re: Hypothetical synonyms

2002-08-28 Thread Janek Schleicher

Aaron Sherman wrote at Wed, 28 Aug 2002 00:34:15 +0200:

> $stuff = (defined($1)?$1:$2) if /^\s*(?:"(.*?)"|(\S+))/;

It gives me the idea of a missing feature:

What really should be expressed is:

my ($stuff) = /^\s*("°.*?"°|\S+)/;

where the ° character would mean,
"Don't capture the previous element".

I think that such a meaning of "uncapturing" elements
from a regexp would be really nice,
as it would help to express things directly,
instead of going complicated ways.

The ° character doesn't have any special meaning,
that's why I chose it in the above example.
However, it also symbolizes a little capturing
and as it isn't filled,
it could really symbolize an uncapturing.

I don't know how hard it would be to implement or
whether it has already been discussed.


Greetings,
Janek




Re: Hypothetical synonyms

2002-08-28 Thread Luke Palmer

On Thu, 29 Aug 2002, Steffen Mueller wrote:

> Nicholas Clark wrote:
> > On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
> >> And I'm definitely going to try any future PerlGolf challenges also
> >> in perl6.
> >
> > Is it considered better if perl6 use more characters than perl5? (ie
> > implying probably less line noise)
> > or less (getting your job done more tersely?)
> 
> From the bit of Perl6 information I've gathered from the Apocalypses, the
> Exegesises (is that really the plural? Sounds horrible.), and my

  Exegeses (like parentheses)

> perl6-language reading, I'd say Perl6 is not only going to be a bit more
> verbose (unless you use the dreaded "use Perl5;" pragma ;) ), but it'll also
> be a Good Thing.

No, not necessarily.  If you do a line-by-line translation, yes.  But the 
fact is, Perl 6 will be able to do more in a single line (cleanly) than 
Perl 5.  For instance, hyper-operators.  So, Perl 6 will contain less 
line-noise and more whitespace than Perl 5, but code will end up being 
shorter, too.  You can see that in Exegesis 4 (or 3, not sure), where Damian 
takes Perl5ish Perl6 code, and then writes it back out in idiomatic Perl 
6. You see how much shorter it becomes.

Luke




Re: Hypothetical synonyms

2002-08-28 Thread Sean O'Rourke

On Thu, 29 Aug 2002, Markus Laire wrote:
> (only 32bit numbers, modulo not fully working, no capturing regexps,
> )

Where does modulo break?

/s




Re: Hypothetical synonyms

2002-08-28 Thread Steffen Mueller

Nicholas Clark wrote:
> On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
>> And I'm definitely going to try any future PerlGolf challenges also
>> in perl6.
>
> Is it considered better if perl6 use more characters than perl5? (ie
> implying probably less line noise)
> or less (getting your job done more tersely?)

From the bit of Perl6 information I've gathered from the Apocalypses, the
Exegesises (is that really the plural? Sounds horrible.), and my
perl6-language reading, I'd say Perl6 is not only going to be a bit more
verbose (unless you use the dreaded "use Perl5;" pragma ;) ), but it'll also
be a Good Thing.

Applying that to Perl Golf, however, isn't possible. It doesn't make sense
to ask whether less line noise is better in golf. Anybody who has seen any
of the winning solutions should realize that whoever wrote that either used
some random string generator or tried to create ASCII art from a color
scan of bird droppings.

Maybe I am just a bit frustrated that I had such a hard time understanding
some of the solutions. :)

> It would be interesting to see whether there are classes of problems
> that go in different directions.

I guess over 90 percent of problems will be longer; possibly about 60
percent being significantly longer. (Mainly because of the changes of A5.)

Steffen
--
@n=(544290696690,305106661574,116357),$b=16,@c=' ,JPacehklnorstu'=~
/./g;for$n(@n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse @_}




Re: Hypothetical synonyms

2002-08-28 Thread Nicholas Clark

On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
> And I'm definitely going to try any future PerlGolf challenges also 
> in perl6.

Is it considered better if perl6 use more characters than perl5? (ie
implying probably less line noise)
or less (getting your job done more tersely?)

It would be interesting to see whether there are classes of problems that
go in different directions.

Nicholas Clark
-- 
Even better than the real thing:http://nms-cgi.sourceforge.net/



Re: Hypothetical synonyms

2002-08-28 Thread Nicholas Clark

On Tue, Aug 27, 2002 at 08:59:09PM -0400, Uri Guttman wrote:
> > "LW" == Larry Wall <[EMAIL PROTECTED]> writes:
> 
>   LW> On 27 Aug 2002, Uri Guttman wrote: : and quoteline might even
>   LW> default to " for its delim which would make : that line:
>   LW> : 
>   LW> : my ($fields) = /(|\S+)/;
> 
>   LW> That just looks like:
> 
>   LW> my $field = //;

> and it would be nice to have a dictionary of builtin rules. :)

my $data = //;

It would make 1 liners very powerful.

How long before someone writes that and ships it with parrot?

And the $64,000 question - will the perl regexp engine be faster than
calling expat? Or will they be the same (because the regexp compiler has
certain builtin rules that are actually implemented as calls to C code
(unless they are over-ridden))?

Nicholas Clark



Re: Hypothetical synonyms

2002-08-28 Thread Markus Laire

On 28 Aug 2002 at 16:04, Steffen Mueller wrote:

> Piers Cawley wrote:
> > Uri Guttman <[EMAIL PROTECTED]> writes:
> >> ... regex code ...
> >
> > Hmm... is this the first Perl 6 golf post?
> 
> Well, no, for two reasons:
> a) There's whitespace.
> b) The time's not quite ready for Perl6 golf because Larry's the only one
> who would qualify as a referee.

I think that time is just right for starting to golf in perl6. Parrot 
with languages/perl6 already supports a working subset of perl6.

I'm currently trying to get factorial-problem from last Perl Golf 
working in perl6, and it has proven to be quite a challenge... 
(only 32bit numbers, modulo not fully working, no capturing regexps, 
)

And I'm definitely going to try any future PerlGolf challenges also 
in perl6.

-- 
Markus Laire 'malaire' <[EMAIL PROTECTED]>





Re: Hypothetical synonyms

2002-08-28 Thread Steffen Mueller

Piers Cawley wrote:
> Uri Guttman <[EMAIL PROTECTED]> writes:
[...]
>> couldn't that be reduced to:
>>
>> m{^\s* $stuff := [ "(.*?)" | (\S+) ] };
>>
>> the | will only return one of the grabbed chunks and the result of
>> the [] group would be assigned to $stuff.
>
> Hmm... is this the first Perl 6 golf post?

Well, no, for two reasons:
a) There's whitespace.
b) The time's not quite ready for Perl6 golf because Larry's the only one
who would qualify as a referee.

And we all know that's not a recreational task :)

Steffen




Re: Hypothetical synonyms

2002-08-28 Thread Trey Harris

In a message dated 28 Aug 2002, Aaron Sherman writes:
> Ok, just to be certain:
>
>   $_ = "0";
>   my $zilch = /0/ || 1;
>
> Is $zilch C<"0"> or 8?

8?  How do you get 8?  You'd get a result object which stringified was "0"
and booleanfied was true.  So here, you'd get a result object vaguely
isomorphic to "0 but true".

> If C<"0">, does it continue to be "true"? What about:
>
>   $_ = "0";
>   my $zilch = /0/ || 1;
>   die "Failed to match zero" unless $zilch;
>
> Is that a bug?

Yes, it's a bug, as I don't see any way to actually die there.  I don't
understand the presence of the C<|| 1> there.  I think you'd just write
C.  If you really truly wanted it to be one if it
failed, but you still wanted the die to work, you'd write:

  $_ = "0";
  my $zilch = /0/ || 1 but false;
  die "Failed to match zero" unless $zilch;

Or, more comprehensibly, just

  $_ = "0";
  my $zilch = /0/
   or die "Failed to match zero";
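
For context, this is how the same question plays out in plain Perl 5, where there is no result object and a captured "0" is simply false. A minimal sketch; match_digit is a made-up helper.

```perl
use strict;
use warnings;

# Return the first digit in $text, or undef if there is none.
sub match_digit {
    my ($text) = @_;
    my ($digit) = $text =~ /(\d)/;   # list context: captured text, or nothing
    return $digit;
}

my $zilch = match_digit("0");
# The match succeeded, yet the captured string "0" is false, so success
# has to be tested with defined() rather than plain truth.
print "matched, but false in boolean context\n"
    if defined $zilch && !$zilch;

# Perl 5 special-cases the string "0 but true": numerically zero, yet true,
# and exempt from the usual "isn't numeric" warning.
my $flag = "0 but true";
print "true, and numerically ", $flag + 0, "\n" if $flag;
```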

Trey




Re: Hypothetical synonyms

2002-08-28 Thread Aaron Sherman

On Wed, 2002-08-28 at 03:23, Trey Harris wrote:

> Note--no parens around $field.  We're not "capturing" here, not in the
> Perl 5 sense, anyway.
> 
> When a pattern consisting of only a named rule invocation (possibly
> quantified) matches, it returns the result object, which in boolean
> context returns true, but in string context returns the entire captured
> text from the named rule (so, one hopes that the C rule
> captures only the quoted text, not the quotes surrounding it).

Ok, just to be certain:

$_ = "0";
my $zilch = /0/ || 1;

Is $zilch C<"0"> or 8?

If C<"0">, does it continue to be "true"? What about:

$_ = "0";
my $zilch = /0/ || 1;
die "Failed to match zero" unless $zilch;

Is that a bug?





Re: Hypothetical synonyms

2002-08-28 Thread Trey Harris

In a message dated 27 Aug 2002, Uri Guttman writes:

> > "LW" == Larry Wall <[EMAIL PROTECTED]> writes:
>
>   LW> On 27 Aug 2002, Uri Guttman wrote: : and quoteline might even
>   LW> default to " for its delim which would make : that line:
>   LW> :
>   LW> : my ($fields) = /(|\S+)/;
>
>   LW> That just looks like:
>
>   LW> my $field = //;
>
> where is the grabbing there? if there was more than just shellword would
> you have to () it for a grab? wouldn't that assign a boolean like perl5
> or is the boolean result only returned in a boolean context?

Note--no parens around $field.  We're not "capturing" here, not in the
Perl 5 sense, anyway.

When a pattern consisting of only a named rule invocation (possibly
quantified) matches, it returns the result object, which in boolean
context returns true, but in string context returns the entire captured
text from the named rule (so, one hopes that the C rule
captures only the quoted text, not the quotes surrounding it).

I think this is more generalizable.  I believe that if one matches an
arbitrary rule which does not contain capturing parentheses, it returns
the result object as well, which should contain the entire match (as if
one put parens around the entire thing).  Correct?

So:

   my $vers = _ / 6/;

should cause $vers to contain either "6" or "".  A successful match object
is true in boolean context, so

   my $vers = / \d/;

would cause $vers to be true, even if the digit matched was zero.

Here's an interesting one:

   my $vers = _ / \d/; # Stringify...
   print "yes!" if $vers;  # ... and booleanize

If $vers contained "0", would it still be true?  That is, does the "is
true" property of the result object survive stringification?  It might be
useful if it did.  On the other hand, of course, one can also imagine:

   my $flag = _ (/ <[01]>/
 or die "No debug setting!");
   print "yes!" if $flag;

where one would want the truth value to follow old conventions.  Perhaps
you could write:

   my $flag = /
   [ 0 :: { $0 is false }
   | 1
   ]/;

But then you have no way short of another string comparison for teasing
out the difference between a failed match and a zero match, which is what
we were trying to get away from.

Maybe I'm just making this too complicated

> what happens to $field if no match was found? undef? the old boolean
> false of a null string wouldn't be good as that could be the result of a
> match. i assume undef could never be the result of a match unless some
> included perl code returned undef to the match object. then coder emptor
> would be the rule.

If the pattern doesn't match... will it return the undefined value, or
will it return a false (and stringwise empty) result object?  I could see
it going either way, but a failed pattern result object is fairly useless,
isn't it?

> this is gonna make all the groups that copied perl5 regexes blow their
> lids. just think about all the neat canned regexes that will be
> done. like Regex::Common but even more so. we will need a CPAN just for
> these alone. full blown *ML parsers, email verifiers, formatted data
> extractors, etc.

More and more lately, I've been finding myself getting syntax errors when
I've wishfully put Perl 6 into my code. :-)

Trey




Re: Hypothetical synonyms

2002-08-27 Thread Uri Guttman

> "LW" == Larry Wall <[EMAIL PROTECTED]> writes:

  LW> On 27 Aug 2002, Uri Guttman wrote: : and quoteline might even
  LW> default to " for its delim which would make : that line:
  LW> : 
  LW> : my ($fields) = /(|\S+)/;

  LW> That just looks like:

  LW> my $field = //;

where is the grabbing there? if there was more than just shellword would
you have to () it for a grab? wouldn't that assign a boolean like perl5
or is the boolean result only returned in a boolean context?

what happens to $field if no match was found? undef? the old boolean
false of a null string wouldn't be good as that could be the result of a
match. i assume undef could never be the result of a match unless some
included perl code returned undef to the match object. then coder emptor
would be the rule.

and it would be nice to have a dictionary of builtin rules. :)

also i assume i was correct in that we won't need CORE:: for those?
unless something we inherit had the same name and we wanted the CORE::
version.

this is gonna make all the groups that copied perl5 regexes blow their
lids. just think about all the neat canned regexes that will be
done. like Regex::Common but even more so. we will need a CPAN just for
these alone. full blown *ML parsers, email verifiers, formatted data
extractors, etc. 

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
- Stem and Perl Development, Systems Architecture, Design and Coding 
Search or Offer Perl Jobs    http://jobs.perl.org



Re: Hypothetical synonyms

2002-08-27 Thread Larry Wall

On 27 Aug 2002, Uri Guttman wrote:
: and quoteline might even default to " for its delim which would make
: that line:
: 
:   my ($fields) = /(|\S+)/;

That just looks like:

my $field = //;

Larry




Re: Hypothetical synonyms

2002-08-27 Thread Larry Wall

On 27 Aug 2002, Uri Guttman wrote:
: > "LW" == Larry Wall <[EMAIL PROTECTED]> writes:
:   LW> m{^\s*[
:   LW> "$stuff:=(.*?)" |
:   LW>  $stuff:=(\S+)
:   LW> ]};
: 
: couldn't that be reduced to:
: 
: m{^\s* $stuff := [ "(.*?)" | (\S+) ] };
: 
: the | will only return one of the grabbed chunks and the result of the
: [] group would be assigned to $stuff.

That too.

Larry




Re: Hypothetical synonyms

2002-08-27 Thread Uri Guttman

> "TH" == Trey Harris <[EMAIL PROTECTED]> writes:

  TH> In a message dated 27 Aug 2002, Uri Guttman writes:
  >> m{^\s* $stuff := [ "(.*?)" | (\S+) ] };

  TH> Or, how about

  TH>   my ($fields) = /( '"')>|\S+)/;

wouldn't quotelike automatically be inherited from the CORE:: rules like
UNIVERSAL is? i have seen  and others mentioned as not being
hardwired builtins but just rules declared elsewhere and inherited. 

and quoteline might even default to " for its delim which would make
that line:

my ($fields) = /(|\S+)/;

uri




Re: Hypothetical synonyms

2002-08-27 Thread Trey Harris

In a message dated 27 Aug 2002, Uri Guttman writes:
> m{^\s* $stuff := [ "(.*?)" | (\S+) ] };

Or, how about

  my ($fields) = /( '"')>|\S+)/;

? :-)

Trey




Re: Hypothetical synonyms

2002-08-27 Thread Uri Guttman

> "LW" == Larry Wall <[EMAIL PROTECTED]> writes:


  LW> That seems like a lot of extra work.  I'd prefer to see something like:

  LW> my $stuff;

  LW> m{^\s*[
  LW>   "$stuff:=(.*?)" |
  LW>    $stuff:=(\S+)
  LW> ]};

couldn't that be reduced to:

m{^\s* $stuff := [ "(.*?)" | (\S+) ] };

the | will only return one of the grabbed chunks and the result of the
[] group would be assigned to $stuff.

uri




Re: Hypothetical synonyms

2002-08-27 Thread Larry Wall

On 27 Aug 2002, Aaron Sherman wrote:
: I just wrote this code in Perl5:
: 
: $stuff = (defined($1)?$1:$2) if /^\s*(?:"(.*?)"|(\S+))/;
: 
: This is a common practice for me when I parse configuration and data
: files whose formats I define. It's nice to be able to quote fields that
: have spaces, and this is an easy way to parse the result.
: 
: In Perl6, it looks like what I would like here is very close, but I'm
: not sure. Certainly, I could do:
: 
: $stuff = ($1 // $2) if m{^\s*["(.*?)"|(\S+)]};
: 
: But I would far prefer:
: 
: $stuff = $field if m{^\s*[
:   "(.*?)" {let $field=$1} |
:   (\S+)   {let $field=$2}]};
: 
: even though it's longer.

That seems like a lot of extra work.  I'd prefer to see something like:

my $stuff;

m{^\s*[
"$stuff:=(.*?)" |
 $stuff:=(\S+)
]};

: Is this possible, or does the underlying implementation of hypothetical
: variables pretty much rule it out?

I don't see any particular reason why a top-level regex can't refer to
variables in the surrounding scope, either by default, or via a :modifier
of some sort.  It's only down in the sub-rules that we have to make sure
there's a hash to poke such hypotheticals into.
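
Something in this spirit can be written in later Perl 5 with named captures, which arrived in Perl 5.10: both alternatives may use the same name, and %+ yields the leftmost group with that name that is defined. This is only a sketch of the idea, not the Perl 6 binding semantics; the name "stuff" and the sub named_field are illustrative.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need Perl 5.10 or later

# Both branches bind the same name, loosely mirroring "$stuff:=(...)"
# appearing in each alternative.
my $field_re = qr/^\s*(?:"(?<stuff>.*?)"|(?<stuff>\S+))/;

sub named_field {
    my ($line) = @_;
    return $line =~ $field_re ? $+{stuff} : undef;
}

say named_field('  "two words" rest');
say named_field('bare rest');
```

Note that the result lands in the %+ hash rather than in a lexical from the surrounding scope.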

Larry




Hypothetical synonyms

2002-08-27 Thread Aaron Sherman

I just wrote this code in Perl5:

$stuff = (defined($1)?$1:$2) if /^\s*(?:"(.*?)"|(\S+))/;

This is a common practice for me when I parse configuration and data
files whose formats I define. It's nice to be able to quote fields that
have spaces, and this is an easy way to parse the result.

In Perl6, it looks like what I would like here is very close, but I'm
not sure. Certainly, I could do:

$stuff = ($1 // $2) if m{^\s*["(.*?)"|(\S+)]};

But I would far prefer:

$stuff = $field if m{^\s*[
"(.*?)" {let $field=$1} |
(\S+)   {let $field=$2}]};

even though it's longer.

Is this possible, or does the underlying implementation of hypothetical
variables pretty much rule it out?
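
For anyone who wants to try the Perl 5 version above, a self-contained sketch follows. The sub name first_field and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Return the first field of a line: either a double-quoted string
# (which may contain spaces) or a bare run of non-space characters.
sub first_field {
    my ($line) = @_;
    if ($line =~ /^\s*(?:"(.*?)"|(\S+))/) {
        # Only one of the two groups took part in the match.
        return defined($1) ? $1 : $2;
    }
    return;    # no field found (empty or all-whitespace line)
}

for my $line ('  "two words" rest', 'bare rest', '   ') {
    my $field = first_field($line);
    print defined $field ? "[$field]\n" : "(no field)\n";
}
```

The defined() test matters: a quoted empty field ("") captures an empty string in $1, which a plain `$1 || $2` would wrongly skip.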