Re: Lvalue Str::words iterator

2005-06-15 Thread Larry Wall
Y'all are getting hung up on the correspondence of "words" with "word
characters", but you're ignoring the fact that most of the time people
want to do awk's version of splitting, matching \S+ words rather than \w+
words (*neither* of which actually matches what people usually mean
by words, in any case).

So I think .match should default to .match(rx:g/\S+/), if it defaults
to anything.  But I still think people would find .words much clearer
when reading someone else's code.  And the :g would be implicit,
so .words(/\w+/) would still be a shortcut for .match(rx:g/\w+/).
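
A rough illustration of the difference between the two candidate defaults,
sketched in Python because the Perl 6 syntax above was still in flux. The
helper names awk_words and wordchar_words are hypothetical; they only stand
in for .match(rx:g/\S+/) and .match(rx:g/\w+/) and are not part of any
proposal in this thread.

```python
import re

# Hypothetical stand-in for .match(rx:g/\S+/): runs of non-whitespace,
# which is how awk splits fields by default.
def awk_words(text):
    return re.findall(r"\S+", text)

# Hypothetical stand-in for .match(rx:g/\w+/): runs of word characters.
def wordchar_words(text):
    return re.findall(r"\w+", text)

sample = "don't stop (now), please"
print(awk_words(sample))       # ["don't", 'stop', '(now),', 'please']
print(wordchar_words(sample))  # ['don', 't', 'stop', 'now', 'please']
```

Neither result is what a reader would call the words of the sentence: \S+
keeps the punctuation attached, while \w+ cuts "don't" in two.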

Larry


Re: Lvalue Str::words iterator

2005-06-15 Thread Juerd
Ingo Blechschmidt wrote 2005-06-15 21:35 (+0200):
> So maybe we should allow words() (or however we'll end up calling it) to
> take an optional parameter specifying what's considered a wordchar,
> with a default of rx/\w+/:

Then isn't making \w+ the default for match much easier?

(Although I still want m// to correspond to .m and s/// to .s, not m//
to .match and s/// to .subst.)

I think this is so often used that the default isn't even as insane as
it may appear at first sight.

>   say "foo bar baz".words()   .join(":");# same as

say "foo bar baz".match

>   say "foo bar baz".words(rx/\w+/).join(":");# "foo:bar:baz"

say "foo bar baz".match(/\w+/)

rx is optional, as bare // does rx// in Rule context, not m//.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Lvalue Str::words iterator

2005-06-15 Thread Ingo Blechschmidt
Hi,

Juerd wrote:
> Ingo Blechschmidt wrote 2005-06-15 20:18 (+0200):
>> >> say join ",", @words; # "hi,my,name,is,ingo";
>> > Following the logic that .words returns the words, the words are no
>> > longer individual words when joined on comma instead of
>> > whitespace...
>> sorry, I don't quite get that.
> 
> "foo bar baz".words.join(',').words.join(':') ne 'foo:bar:baz';
> 
> so somewhere in that process, foo, bar and baz stopped being words by
> the definition used by a words method that splits on whitespace:
> joined on comma, they are one word together.

ah! I understand :)

So maybe we should allow words() (or however we'll end up calling it) to
take an optional parameter specifying what's considered a wordchar,
with a default of rx/\w+/:

  say "foo bar baz".words()   .join(":");# same as
  say "foo bar baz".words(rx/\w+/).join(":");# "foo:bar:baz"

  say "foo,bar,baz".words()   .join(":");# same as
  # "," doesn't match /\w+/
  say "foo,bar,baz".words(rx/\w+/).join(":");# "foo:bar:baz"

  # Now "," is considered to be part of words:
  say "foo,bar,baz".words(rx/[\w|,]+/).join(":");# "foo,bar,baz"

  say "foo bar baz".words(rx/b../).join(":");# "bar:baz"

Then your example...
  say "foo bar baz".words.join(',').words.join(':'); # "foo:bar:baz"

And:
  say "foo bar baz".words.join(",").words(rx/[\w|,]+/).join(":");
 # "foo,bar,baz"


I hope these examples make sense...
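
The examples above can be mimicked with a small Python sketch. The function
words() here is hypothetical and only approximates the proposal: it returns
every non-overlapping match of an optional pattern, with \w+ assumed as the
default. Perl 6's rx/[\w|,]+/ is written as the character class [\w,]+ in
Python.

```python
import re

# Hypothetical words(): every non-overlapping match of `pattern`,
# defaulting to runs of word characters as proposed above.
# finditer + group(0) returns whole matches even if the pattern has groups.
def words(text, pattern=r"\w+"):
    return [m.group(0) for m in re.finditer(pattern, text)]

print(":".join(words("foo bar baz")))             # foo:bar:baz
print(":".join(words("foo,bar,baz")))             # foo:bar:baz
print(":".join(words("foo,bar,baz", r"[\w,]+")))  # foo,bar,baz
print(":".join(words("foo bar baz", r"b..")))     # bar:baz

# The round trip: join on comma, then take the words again.
joined = ",".join(words("foo bar baz"))
print(":".join(words(joined)))                    # foo:bar:baz
print(":".join(words(joined, r"[\w,]+")))         # foo,bar,baz
```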

--Ingo

-- 
Linux, the choice of a GNU | Perfection is reached, not when there is no
generation on a dual AMD   | longer anything to add, but when there is
Athlon!                    | no longer anything to take away.



Re: Lvalue Str::words iterator

2005-06-15 Thread Juerd
Ingo Blechschmidt wrote 2005-06-15 20:18 (+0200):
> >> say join ",", @words; # "hi,my,name,is,ingo";
> > Following the logic that .words returns the words, the words are no
> > longer individual words when joined on comma instead of whitespace...
> sorry, I don't quite get that.

"foo bar baz".words.join(',').words.join(':') ne 'foo:bar:baz';

so somewhere in that process, foo, bar and baz stopped being words by
the definition used by a words method that splits on whitespace: joined
on comma, they are one word together.

It was a demonstration of why "words" for this feature is a bad name,
not anything against the presentation using commas.
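
The round trip can be reproduced with a short Python sketch, assuming a
words method that splits on whitespace. The name split_words is
hypothetical and stands in for that behaviour only.

```python
# Hypothetical whitespace-splitting words(), the behaviour being criticised.
def split_words(text):
    return text.split()

once = split_words("foo bar baz")
twice = split_words(",".join(once))

print(once)             # ['foo', 'bar', 'baz']
print(twice)            # ['foo,bar,baz']
print(":".join(twice))  # foo,bar,baz
```

After the join on comma there is no whitespace left to split on, so the
second call sees a single "word".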


Juerd


Re: Lvalue Str::words iterator

2005-06-15 Thread Ingo Blechschmidt
Hi,

Juerd wrote:
> Ingo Blechschmidt wrote 2005-06-15 19:14 (+0200):
>> as Larry mentioned in another thread that he wants a "different
>> notation for word splitting"
>> (http://www.nntp.perl.org/group/perl.perl6.language/21874),
>> how about that, similar to Haskell's "words" function:
> 
> "words" is wrong for something that splits.

I took the name from Haskell's words, but I don't mind a name change.

> It'd be right for something that matches. Let me demonstrate: 
> 
>    "(foo bar --baz blah-- quux)".words;
> This should return <foo bar baz blah quux>, not <(foo bar --baz blah--
> quux)>. So the name "words" isn't good for this.

right, it should return <foo bar baz blah quux>, with the additional
lvalue ability, so that the following (contrived) example works:

my $str = "(foo bar --baz blah-- quux)";
$str.words .= map:{ substr $^word, 1 };
say $str;
 # "(oo ar --az lah-- uux)";
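
The effect of the lvalue example can be approximated in Python, with the
caveat that Python strings are immutable, so this sketch returns a new
string instead of modifying one in place. The helper map_words is
hypothetical, and it assumes \w+ as the definition of a word.

```python
import re

# Hypothetical map_words(): apply `fn` to every match of `pattern` and
# splice the results back into the string, leaving the text between
# the matches untouched.
def map_words(text, fn, pattern=r"\w+"):
    return re.sub(pattern, lambda m: fn(m.group(0)), text)

s = "(foo bar --baz blah-- quux)"
s = map_words(s, lambda word: word[1:])  # drop the first character of each word
print(s)  # (oo ar --az lah-- uux)
```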

>> say join ",", @words; # "hi,my,name,is,ingo";
> 
> Following the logic that .words returns the words, the words are no
> longer individual words when joined on comma instead of whitespace...

sorry, I don't quite get that.

I wanted to show the contents of @words, I did not want to split the
string into words and then concatenate the words again.

   # Maybe this...
   say " hi my  name is ingo ".words.map:{ "($_)" }
 # "(hi) (my) (name) (is) (ingo)"
   # ...is clearer?


--Ingo

-- 
Linux, the choice of a GNU | Black holes result when God divides the
generation on a dual AMD   | universe by zero.
Athlon!                    |



Re: Lvalue Str::words iterator

2005-06-15 Thread Juerd
Ingo Blechschmidt wrote 2005-06-15 19:14 (+0200):
> as Larry mentioned in another thread that he wants a "different
> notation for word splitting"
> (http://www.nntp.perl.org/group/perl.perl6.language/21874),
> how about that, similar to Haskell's "words" function:

"words" is wrong for something that splits. It'd be right for something
that matches. Let me demonstrate:

"(foo bar --baz blah-- quux)".words;

This should return <foo bar baz blah quux>, not <(foo bar --baz blah--
quux)>. So the name "words" isn't good for this.

> say join ",", @words; # "hi,my,name,is,ingo";

Following the logic that .words returns the words, the words are no
longer individual words when joined on comma instead of whitespace...


Juerd