Re: Proposal: chop() dropped

2000-09-01 Thread Tom Christiansen

I already proposed that. Benchmarks show that reading a file with
explicit chomp() is easily 20% slower than reading the same file with
implicit chomp(), through the -l command line switch.

And what, pray tell, do you do about the small matter of wanting
to read some files without implicit record-terminator deletion, and
others with such?  Or, for that matter, applying it to something
other than the implicit ARGV reading from -n or -p?  Seems like
it should be a filehandle property.

--tom



Re: Proposal: chop() dropped

2000-08-31 Thread Nathan Torkington

Eric Roode writes:
 Useful functions all, no doubt. But I would lobby heavily for a new
 set of names -- ones that can be remembered! Quick -- which trims 
 leading spaces, champ, chump, or chimp?

My favourite: chafe().

Nat



Re: Proposal: chop() dropped

2000-08-31 Thread Dan Zetterstrom

On Thu, 31 Aug 2000 19:59:31 +0200, you wrote:

tr/\w//dlt   # Trim all leading & trailing whitespace from $_

Eh, scratch that. Too much caffeine, I guess.

tr/\n\r\t //dlt;  # Trim some whitespace.

-DZ




-- 
Tell me your dreams and I will crush them.



Re: Proposal: chop() dropped

2000-08-31 Thread Jonathan Scott Duff

On Thu, Aug 31, 2000 at 07:59:31PM +0200, Dan Zetterstrom wrote:
 Why not use the "function" we already got, tr? Something like:
 
 tr///l   # Translate only _l_eading characters matching.
 tr///t   # Translate only _t_railing characters matching.

 With "Only leading" I mean translate from start/end until you find a
 character not matching. Then you can do nifty things such as:

Um, that would radically change the meaning of tr///.  Better to use
s/^// and s/$//.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]



Re: Proposal: chop() dropped

2000-08-31 Thread Tom Christiansen

tr///l   # Translate only _l_eading characters matching.
tr///t   # Translate only _t_railing characters matching.

With "Only leading" I mean translate from start/end until you find a
character not matching. Then you can do nifty things such as:

tr/\w//dlt   # Trim all leading & trailing whitespace from $_

tr/// does not admit character classes!

The "leading" thing can be effected with \G.  For example:

s/\G /0/g;  # change leading blanks to 0

In general, getting folks to write

s/^\s+//s;
s/\s+$//s;   # XXX: \z

is a *good* thing.
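
A minimal sketch of the two idioms above, assuming made-up sample
strings; the variable names are illustrative and not from the thread.
It trims with two anchored substitutions and uses \G with /g to touch
only the unbroken run of blanks at the start of a string.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the thread.
my $padded = "  \t hello world \n";

# Trim leading, then trailing whitespace with two anchored substitutions.
(my $trimmed = $padded) =~ s/^\s+//;
$trimmed =~ s/\s+\z//;

# \G anchors each match where the previous one ended, so with /g only the
# unbroken run of blanks at the start of the string is replaced.
(my $zeroed = "   42 7") =~ s/\G /0/g;

print "[$trimmed]\n";   # prints "[hello world]"
print "[$zeroed]\n";    # prints "[00042 7]"
```

The blank between "42" and "7" survives because the previous match
ended at the "4", so \G has nothing to anchor to after that point.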

--tom



Re: Proposal: chop() dropped

2000-08-31 Thread Tom Christiansen

How would you do:

# Writer insists on blank line between paragraphs, first line indented.
# Publisher insists on one paragraph/line, first word ALL CAPS.

Step 1: Fire the lame publisher.  I'm serious.  It's amazing
what people tolerate.  Some things aren't worth the pain.

{
local $/ = ""; #slurp paragraph at a time.
while (<INFILE>) {
   s/\n//gm;# combine into one line

No, that should be s/\n/ /g, as otherwise you merge two
words and get for example "twowords" on this paragraph.

And that /m does nothing, as there's no ^ nor $ to affect.

   s/^\s//; # get rid of indent

Um, surely you mean s/^\s+// there instead.

   y/a-z/A-Z/l  # upcase first (English) word.

Does that mean capitalize it, or render all the word's letters
in uppercase?  I guess you said ALL CAPS, so, trivially enough,
it's merely

s/(\w+)/\U$1/;

Whatcha doin' with tr anyway? :-)  That's what the vi escapes are for!

   print OUTFILE;
}
}
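
Putting the corrections above together, one possible version of the
loop looks like the sketch below. It assumes in-memory filehandles and
made-up sample text so that it runs on its own; stripping the
paragraph's trailing newlines before joining lines is an extra step
that is not in the quoted code.

```perl
use strict;
use warnings;

# Hypothetical input: indented paragraphs separated by blank lines.
my $input  = "  first paragraph\nruns over two lines\n\n  second one\nalso wraps\n";
my $output = '';

open my $in,  '<', \$input  or die "cannot read sample: $!";
open my $out, '>', \$output or die "cannot write sample: $!";

{
    local $/ = "";              # paragraph mode: one paragraph per read
    while (<$in>) {
        s/\s+\z//;              # added step: drop the paragraph's trailing newlines
        s/\n/ /g;               # join the lines, keeping words apart
        s/^\s+//;               # get rid of the indent
        s/(\w+)/\U$1/;          # upcase the first word
        print {$out} "$_\n";    # one paragraph per output line
    }
}
close $out;

print $output;
# prints:
# FIRST paragraph runs over two lines
# SECOND one also wraps
```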

--tom



Re: Proposal: chop() dropped

2000-08-31 Thread Eric Roode

TomC wrote:
In general, getting folks to write

s/^\s+//s;
s/\s+$//s;   # XXX: \z

is a *good* thing.

Why?

Removing leading/trailing whitespace is a tremendously frequently-
performed task. Perl gives you -l on the command line to strip
newlines on input and add them on output, simply because that's
what you need to do, very often. 

Perl allows you to glomp your input one "paragraph" at a time,
by setting $/ to "", because it's quicker and more convenient than
searching for the pattern /\n{2,}/, and because people often want
to do that.

I'm not arguing in favor of the tr/// hack specifically, but 
gosh, wouldn't it be nice if there were a thwack() builtin that
stripped leading and trailing spaces? 

while (<>)
{
chomp;
thwack;  # or whatever
barf if /^barf/;
last if /!!$/;
print LOG "$_ found\n";
}

 --
 Eric J. Roode,  [EMAIL PROTECTED]       print  scalar  reverse  sort
 Senior Software Engineer                'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation                        '.r', 'h ', 'uj', 'p ', 'ts';




Re: Proposal: chop() dropped

2000-08-31 Thread Bart Lateur

On Thu, 31 Aug 2000 13:36:10 -0600, Tom Christiansen wrote:

I'm not arguing in favor of the tr/// hack specifically, but 
gosh, wouldn't it be nice if there were a thwack() builtin that
stripped leading and trailing spaces? 

No.  People should learn intrinsic mechanisms with which they can
construct infinitely many beautiful and powerful effects.  This empowers
them.  Making them learn yet-another-function-call merely hamstrings
them with a dead fish; tomorrow, they shall starve.

Then let's drop chop(). This function is even far more easily
implemented than trim(), thwack(), or whatever you choose to call it.

Eh... full circle?

-- 
Bart.



Re: Proposal: chop() dropped

2000-08-30 Thread Jonathan Scott Duff

On Wed, Aug 30, 2000 at 02:31:00PM -0600, Nathan Torkington wrote:
 chomp() is best used for chop()'s main raison d'etre, removing $/
 from a string.  I say we drop chop().

I'll second that motion.  We already have lots of ways of removing the
last character of a string if that's what we really need.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]



Re: Proposal: chop() dropped

2000-08-30 Thread Tom Christiansen

chomp() is best used for chop()'s main raison d'etre, removing $/
from a string.  I say we drop chop().

So code that says

chop($k,$v)

will need to say

for ($k,$v) { s/.\z//s } 

or else something like:

for ($k, $v) { substr($_, length() - 1) = '' }

I'm not sure I find either of those more legible.  And they certainly
won't be faster.  chop() has been around since perl1, too.

Then again, yes, people do tend to use it when they oughtn't.  Hm...
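
For comparison, the three spellings above can be run side by side. This
is only a sketch with made-up values; it checks that they agree on
ordinary non-empty strings and says nothing about speed.

```perl
use strict;
use warnings;

# Three copies of the same hypothetical key/value pair.
my ($k1, $v1) = ("key:", "value\n");
my ($k2, $v2) = ($k1, $v1);
my ($k3, $v3) = ($k1, $v1);

chop($k1, $v1);                                     # builtin, takes a list
for ($k2, $v2) { s/.\z//s }                         # regex; /s lets . match "\n"
for ($k3, $v3) { substr($_, length() - 1) = '' }    # lvalue substr

print "$k1=$v1 $k2=$v2 $k3=$v3\n";   # prints "key=value key=value key=value"
```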

---tom



Re: Proposal: chop() dropped

2000-08-30 Thread Tom Christiansen

the reason that they're duplicatable with other features, while I want
to drop chop because its main purpose has now been replaced with the
far superior chomp.

Except that chomp() relies upon the ueberglobal $/ variable,
irrespective of the source of the data being chomped.  
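
A small sketch of that dependence, assuming a made-up record that ends
in CRLF: chomp() removes whatever $/ currently holds, no matter where
the string came from.

```perl
use strict;
use warnings;

# Hypothetical record that ends in CRLF rather than a bare newline.
my $record = "field1,field2\r\n";

my $default = $record;
chomp $default;                 # $/ is "\n", so only the "\n" is removed

my $crlf = $record;
{
    local $/ = "\r\n";          # tell chomp what the terminator really is
    chomp $crlf;
}

printf "%d %d\n", length($default), length($crlf);   # prints "14 13"
```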

--tom



Re: Proposal: chop() dropped

2000-08-30 Thread Nathan Torkington

Tom Christiansen writes:
 So code that says
 
 chop($k,$v)
 
 will need to say
 
 for ($k,$v) { s/.\z//s } 
 
 or else something like:
 
 for ($k, $v) { substr($_, length() - 1) = '' }

I don't think chop() is an operation that's done often enough for
either of the things above to be a problem.

 I'm not sure I find either of those more legible.  And they certainly
 won't be faster.  chop() has been around since perl1, too.

Yes, but chop()'s original purpose was what chomp() is now used for.
I doubt Larry would really have put in a function to remove the last
char in a string just to have a function that removes the last char in
a string.  It was the chomp()-like itch I think he was trying to
scratch.

 Then again, yes, people do tend to use it when they oughtn't.  Hm...

Exactly.

Nat



Re: Proposal: chop() dropped

2000-08-30 Thread Peter Scott

At 04:36 PM 8/30/00 -0600, Tom Christiansen wrote:
 the reason that they're duplicatable with other features, while I want
 to drop chop because its main purpose has now been replaced with the
 far superior chomp.

Except that chomp() relies upon the ueberglobal $/ variable,
irrespective of the source of the data being chomped.

I presume that line disciplines will be object-oriented and inherit from 
some base class; therefore a bare chomp will use the line terminator from 
that base class and for the others, you could do something like 
$fh->chomp($line) to do chomping specific to a particular filehandle.  Make 
sense?
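
To make the idea concrete, here is a rough sketch in today's Perl. The
LineSource class and its terminator attribute are purely hypothetical;
nothing like them is specified in this thread, and a real line
discipline would presumably be attached to the filehandle itself.

```perl
use strict;
use warnings;

# Hypothetical wrapper class; the name and interface are assumptions.
package LineSource;

sub new {
    my ($class, %args) = @_;
    return bless { terminator => $args{terminator} // "\n" }, $class;
}

# Remove this source's own terminator from the caller's variable.
sub chomp {
    my $self = shift;
    local $/ = $self->{terminator};
    return CORE::chomp($_[0]);      # $_[0] aliases the caller's scalar
}

package main;

my $dos  = LineSource->new(terminator => "\r\n");
my $line = "hello\r\n";
$dos->chomp($line);

print length($line), "\n";          # prints "5"
```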

--
Peter Scott
Pacific Systems Design Technologies




Re: Proposal: chop() dropped

2000-08-30 Thread Bart Lateur

On Wed, 30 Aug 2000 16:14:35 -0600, Tom Christiansen wrote:

I say we drop chop().

So code that says

chop($k,$v)

will need to say

for ($k,$v) { s/.\z//s } 

or else something like:

for ($k, $v) { substr($_, length() - 1) = '' }

I'm not sure I find either of those more legible.

I'm sure. It's not. But either of these works:

substr($_, -1) = '';

$chopped = substr $_, -1, 1, '';

chop() is a speed optimization for a (currently) pretty uncommon task:
removing the last character of a string. Why not the first character,
for example? The string equivalent of shift()?
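
As a sketch of that symmetry, four-argument substr() can remove and
return a character from either end; the sample string is made up.

```perl
use strict;
use warnings;

my $s = "hello";

my $last  = substr($s, -1, 1, '');   # remove and return the last character
my $first = substr($s,  0, 1, '');   # the shift()-like counterpart

print "$first $last $s\n";           # prints "h o ell"
```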

Only people who insist that ALL lines in a text file should end in a
newline, would still prefer to use it.

-- 
Bart.



Re: Proposal: chop() dropped

2000-08-30 Thread Tom Christiansen

I would actually like to see chop expanded to allow a variable number of
characters to be removed, and a sister function to cut the head off. Yes, I
know you can do this with substr, but sometimes you want the performance
when you need to cut up a string into fields. Let's say the new function is
called 'take'.
  while ($str = <F>)
 {
 $head = take($str,40);
 $tail = chop($str,40);
 $middle = $str;
 #assuming $str had 120 characters in it
 }

Just a thought

The current syntax is more like chop(@), not chop($;$).
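
For what it's worth, the proposed behaviour can be approximated today
with four-argument substr(). In the sketch below, take() and chop_n()
are hypothetical helper names (chop_n stands in for the proposed
two-argument chop), and the 120-character string is made up.

```perl
use strict;
use warnings;

# Hypothetical helpers; both modify the caller's string through $_[0].
sub take   { my $n = $_[1]; return substr($_[0], 0,   $n, '') }
sub chop_n { my $n = $_[1]; return substr($_[0], -$n, $n, '') }

my $str    = ('a' x 40) . ('b' x 40) . ('c' x 40);
my $head   = take($str, 40);     # first 40 characters, removed from $str
my $tail   = chop_n($str, 40);   # last 40 characters, removed from $str
my $middle = $str;               # whatever is left

print join(' ', map { length } $head, $middle, $tail), "\n";   # prints "40 40 40"
```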

--tom



Re: Proposal: chop() dropped

2000-08-30 Thread skud

On Wed, Aug 30, 2000 at 02:31:00PM -0600, Nathan Torkington wrote:
chomp() is best used for chop()'s main raison d'etre, removing $/
from a string.  I say we drop chop().

Works for me.  Are you going to RFC it?

K.

-- 
Kirrily Robert -- [EMAIL PROTECTED] -- http://netizen.com.au/
Open Source development, consulting and solutions
Level 10, 500 Collins St, Melbourne VIC 3000
Phone: +61 3 9614 0949  Fax: +61 3 9614 0948  Mobile: +61 410 664 994



Re: Proposal: chop() dropped

2000-08-30 Thread Nathan Wiger

Tom Christiansen wrote:
 
 I'll second that motion.  We already have lots of ways of removing the
 last character of a string if that's what we really need.
 
 But they're slow and hard to read.

I think the word "drop" should be clarified as "dropped from the core
binary".

In a very cool email, Bryan Warnock talked about half a dozen different
chop-like functions that are useful but not in core:

http://www.mail-archive.com/perl6-language@perl.org/msg01522.html

I think chop, champ, chip, and friends should be available via the

   use String::Chop;

or related module.

-Nate